FOREWORD


This issue of the Review might be labelled “Research takes time out to
examine itself.” It is a sort of methodological résumé of the studies which 
have been reported in various fields in the preceding issues of the Review. 
It includes five types of material: (a) new developments in research tech- 
nics; (b) evaluative studies of research technics; (c) new types of appli- 
cations of given technics; (d) illustrations of current uses of technics; 
and (e) notes on needed research, especially that which is contingent upon 
the development of new approaches. 

The order of presentation of the methods and technics is: documentary, 
observational and clinical, survey, statistical, and experimental. The two 
final chapters deal with the administrative side of research. The chapter 
topics are not presumed to be coordinate; some deal with broad methods 
of work, some with narrower technics; but all have their place in a con- 
sideration of research processes. 

There have been a number of previous treatments in the Review dealing 
with research procedures, in contrast to research findings. These have 
appeared as follows: April 1932, Chapter V; October 1932, Chapter IV; 
December 1932, Chapter I; February 1933, Chapters II and III; February
1934, Chapters I to XIII; April 1934, Chapter III; June 1934, Chapter II;
April 1935, Chapter IV; June 1935, Chapter IV; October 1935, Chapter
I; December 1935, Chapter III; February 1936, p. 54-60; December 1936,
Chapter VII; February 1937, Chapter II; June 1937, Chapter II; June 
1938, Chapter VIII; October 1938, Chapters VII and XIII; February 
1939, Chapter V; April 1939, Chapter IV. 

The issue of February 1934 was given wholly to research attacks but 
was organized by fields of application so that the processes did not emerge 
so clearly and generally as under the present form of treatment. With the 
exception of these former treatments the present issue has no predecessors, 
and much of the content therefore has no definite earlier time limit. This fact 
has operated to make the issue longer than usual in spite of a consistent 
attempt to make each treatment as compact as possible. A number of chap- 
ters could profitably have been doubled in length. 

Turning our attention to methods of work and consciously examining 
them should increase the efficacy of research in dealing with problems. 


Douglas E. Scates
Chairman of the Editorial Board 
INTRODUCTION


The February 1934 Review on “Methods and Technics of Educational
Research” was organized primarily in terms of the fields to which investi- 
gational methods are applied, with chapters devoted to methods of re- 
search in such areas as the curriculum, teacher personnel, school organi- 
zation, teaching method, finance, buildings, tests, child psychology, pupil 
personnel and guidance, and school law. Such a treatment of research 
procedures is useful to the worker especially interested in a given sub- 
division of education, although it involves duplication by reason of dis- 
cussing a particular method in a number of different chapters and of 
course does not present systematically in one section the various technics 
and applications of an investigational procedure. The limitations inherent 
in the organization of this earlier number of the Review caused the com- 
mittee in charge of the present issue to outline the chapters in terms of the 
methodology of educational research. Such an organization in keeping 
with problem-solving approaches or research procedures seems a func- 
tional one, with both logical and psychological values, as illustrated by 
including in one chapter the technics of the school and community survey 
as a problem-solving mode of attack, together with appropriate examples 
of application to a number of fields. 

The authors of the several chapters were requested to stress recent 
developments in technics and new applications of each research method, 
with the present number of the Review representing a supplement to, rather 
than a duplication of, material already available in published books on 
methodology. In contrast with this issue of the Review, Part II of the 1938 
yearbook of the National Society for the Study of Education, The Scientific 
Movement in Education, deals primarily with the contribution of research 
to the numerous subdivisions of education; and the section devoted to 
methods of inquiry is not intended to serve as a detailed survey of the 
literature. The 1939 joint yearbook of the American Educational Research
Association and the Department of Classroom Teachers, The Implications
of Research for the Classroom Teacher, summarizes concisely
the findings of research in the various fields of education, especially the 
school subjects, with only a few supporting references for each chapter. 

The publications characterized in this introductory statement, together 
with the present issue of the Review, present impressive evidence of a 
greatly expanded and improved literature of educational inquiry. In this 
development, methods from a variety of fields—philosophy and logic, 
mathematics and statistics, sociology and social work, psychology and 
experimental science, history, economics, law, and library science—have 
made their contributions. 

Carter V. Good, Chairman
Committee on Methods of Research in Education 
CHAPTER I
Library and Bibliographical Procedures¹


HERMAN G. RICHEY 


This chapter deals with the literature concerning the effective use of
library resources in solving educational problems. It is not primarily con- 
cerned with general reference books, prepared bibliographies, or indexes 
of the literature, but with articles and longer treatments indicating how 
these and other aids may be employed to locate all materials that bear 
upon the different fields of education or upon the various aspects of those 
fields. In other words, this chapter does not deal with reference books and 
guides; rather it attempts to list and briefly evaluate the more recent ef- 
forts to make known the contributions that bibliographical and library 
aids can make to the solution of educational problems. 


Library Consultant Service 


Between 1932 and 1937, the Library Consultant Service (11) of Teach- 
ers College, Columbia University, published fourteen issues of the Library 
Consultant (30), a mimeographed bulletin designed to aid students in 
bibliographical and research activities. In 1937 the Library Consultant was 
superseded by two publications (30)—the Library Consultant Service 
Leaflet and the Library Consultant Book List. The first of these was de- 
signed to give directions for using library and bibliographical tools; the 
second, to present lists of references in closely defined subjects. 

Guides somewhat similar in nature to those published as issues of the 
Library Consultant have been published in different volumes of the Teach- 
ers College Record. Among these, Witmer’s “Educational Research; A 
Bibliography on Sources Useful in Determining Research Completed or 
under Way” (45), Witmer and Miller’s “Guides to Educational Litera- 
ture in Periodicals” (46), and Witmer and Miller’s “U. S. Office of Edu- 
cation Serial Publications” (47) are useful. Witmer and Feagley’s “A 
Beginner’s Guide to Bibliography” (44), published separately, is a simi- 
lar aid. 

Carter Alexander, appointed a member of the Consultant Service in 
1932 to act as guide and mediator in giving bibliographical assistance to 
students of school administration, a few months later became the pioneer 
library professor attached to the library of an institution for training in 
educational research (5). As library professor the scope of his work was 
enlarged to include giving bibliographical aid to all students engaged in
research, giving instruction in library resources and methods, and preparing
teaching materials in library procedures.

1 Bibliography for this chapter begins on page 591.


Basic Teaching Materials 


In 1935 Alexander (4) published a text and reference book, which was 
the outgrowth of his work as library consultant and professor. This vol- 
ume made available for all students in the field of education the specialized 
knowledge of the reference librarian that related to education and indicated 
how such knowledge could be made to serve investigators of educational 
problems. The text was accompanied by a book of exercises (2) designed 
to develop skill in locating and using the library resources described in 
the text. The textbook may be supplemented by Educational Research (3) 
by the same author; the articles by Witmer and her associates (44, 45, 46, 
47); the various issues of the Library Consultant (30); Monroe, Hamil- 
ton, and Smith’s Locating Educational Information in Published Sources 
(34); Abel’s “Guides for Studying Comparative Education” (1); Edwards’
“Where and How To Find the Law Relating to Public School Adminis- 
tration” (19); Townsend and Stewart’s Guides to Study Materials for 
Teachers (41); Foster and others’ “Aids in Selecting Enrichment Mate- 
rials” (20); and by several guides developed by students under the direc- 
tion of Alexander. 

Students working under Alexander have compiled excellent guides to 
library materials on the following subjects: the curriculum (18), elemen- 
tary education (32), health and physical education (12, 16, 39), locating 
the school law (14), periodical literature on natural science (29), Negro 
education (15), penal education (36), rural education (13), secondary 
education (33), speech education (40), education of teachers (28), voca- 
tional research and criteria for measuring vocational success (17, 35), 
public school administration (10), the handicapped child (31), and in- 
dustrial arts education (23). 


Library Procedures in General Treatments of Research Methodology


Several textbooks dealing with research methods and technics included 
useful sections on the library. Good, Barr, and Scates (22) discussed 
sources of information and types of educational literature, guides to edu- 
cational literature, periodical literature in education, systematic organiza- 
tion of educational literature, and other topics. Headley’s chapter “How 
To Use the Library” (24) and the fifth revision of the manual by Hutchins, 
Johnson, and Williams (26) provided basic instruction in general library 
procedures for students less advanced than those for whom Good and his 
associates wrote. The text by Good, Barr, and Scates should also be supple- 
mented by the writings of Reeder (37), Whitney (42), Almack (6), Good 
(21), and others. 
Library Procedures in Special Aids


Valuable aids were contained in highly specialized treatments of library 
resources and methods, such as government publications and library hand- 
books. 

Government publications—Schmeckebier’s Government Publications and 
Their Use (38) and Boyd’s United States Government Publications as 
Sources of Information for Libraries (9) provided not only a list and de- 
scription of materials, and a bibliography of guides available for use in 
connection with them, but also discussions of the use, classification, and 
availability of the materials and guides. These volumes were supplemented 
by some of the volumes of the annual publication, Public Documents (8), 
Wyer’s U. S. Government Documents (48), and similar works. Higgins’ 
Canadian Government Publications (25) is indispensable to the research 
worker making use of published information from the Canadian gov- 
ernment. 

Handbooks—Handbooks have been prepared for use in connection with 
some libraries (27, 43). These are especially helpful in the libraries for
which they were written and some have been made, with minor modifica- 
tions, to serve more generally (26, 27). 


Future Developments 


Within the next few years there promises to be an extensive develop- 
ment of literature on library methods and on procedures centering about 
the use of the camera, which was until recently employed by scholars only
in notetaking and manuscript collation. However, with the improvement 
of the projector and its adaptation to microphotography, library use of 
the camera has been extended to “(1) making library holdings available 
outside, in extension of interlibrary loans; (2) commanding the basic 
materials for research, especially the bulky records of the social sciences 
and the humanities; (3) preservation of the perishable, such as news- 
papers; and (4) making available the results of research, such as Science 
Service offers in partnership with science editors to preserve and film in 
full on demand any paper they print in abstract” (7: 1936, p. 120). 

At present the literature on microphotography is, for the most part, 
concerned with the problems of librarians; but the work being done in 
numerous centers brings nearer the time when scholars will need indexes, 
union lists, and other guides to the use of microfilms and some instruc- 
tion in the methods and procedures involved in their use. 
CHAPTER II


Current Historiography in Education¹


H. G. GOOD 


This chapter is concerned with problems and trends in the writing of
educational history rather than with the content of such history. Recent 
contributions to the history of education were dealt with at length in the 
October 1939 issue of the Review of Educational Research. It is not there- 
fore the province of this chapter to attempt any review of recent histori- 
cal writings with a view to giving a résumé of their findings. It is, in- 
stead, our purpose to attempt to observe certain developments in the 
methods and interests of educational historians, and historical contribu- 
tions will be cited only as illustrations of these observations. 


Treatises on Historical Method 


In the past few years a number of treatises on historiography have ap- 
peared. While the principles of historical research have remained prac- 
tically the same for some time, elaborations of these principles and vari- 
ous specific applications of them are presenting new aspects. Several books 
on educational research methods (50, 68, 94) have contained chapters 
relating the historical technics particularly to educational materials and 
problems. Detailed bibliographies will be found in these chapters, citing 
a wealth of supporting material. No book dealing solely with the prob- 
lem of research in educational history is known, but the principles of gen- 
eral historical research and history writing are presented in a number of 
recent texts which the research worker should consult. The most helpful 
of these are: 51, 53, 57, 61, 73, 76, 79, 83, 87, 91. Space is available for 
only a composite summary statement of the content presented and the 
principles developed in these recent books on historiography. We may
conveniently take this from a recent review by Good (67):


Documents and remains are the chief primary sources, the first witnesses to a 
fact and therefore the only solid bases for historical research, although classifications 
of sources have been broadened within recent years. In the case of secondary sources, 
more than one mind has come between the historical event and the user of the sources. 

The sources are subjected to external and internal criticism. External criticism is 
concerned with the genuineness of the document as such, and deals with data relating 
to form and appearance rather than meaning of contents. Internal criticism deals with 
the meaning and trustworthiness of statements within the document—that is, it weighs 
the testimony of the document in relation to the truth. External criticism makes use 
of certain auxiliary sciences and a variety of procedures in dealing with forgeries and 
hoaxes, inventions and distortions, authorship and time, and borrowings. Internal
criticism seeks to determine the literal meaning and the real meaning of statements, 
the competence of the observer, the truthfulness and honesty of the observer or author. 


1 Bibliography for this chapter begins on page 59S. 
Historical composition is a synthetic and constructive process which involves the
mechanical problem of documentation, the logical problem of selection and arrange- 
ment of topics and subtopics, and the philosophical problem of interpretation. An 
appropriate combination of the chronological and topical plans for arrangement of 
topics and materials is recommended. Few works in history have been directly touched 
by one of the broad philosophies or theories of history, such as the evolutionary hypothe- 
sis or the rhythm-philosophy (cycle theory). Specific schools of historical interpre- 
tation, such as economic determinism, permit a pragmatic test, through use of historical 
materials, and are of more general interest to writers and users of history, although 
many of the best historical works have been written according to the individual bent of 
the author rather than in keeping with some special interpretation or school of thought. 
The development of a newer type of history, which is eclectic in approach and inter- 
pretation, necessarily depends on the contributions of many sciences for gathering and 
interpreting evidence and for training its workers. 

Only when a perplexing question has been identified and correctly stated does 
profitable study of history begin. Inductive reasoning is, of course, the procedure open 
to the historian, in making penetrating inductive inferences from known facts which 
offer only a partial explanation. In turn, the superimposing of the general explanatory 
concept upon the facts or the testing of the working hypothesis represents a deductive 
process. As a rule, multiple causation is the explanation of any important historical 
event. The problem of historical perspective represents great difficulties because of 
tendencies to evaluate events and personages, distant in time or space, according to the 
standards of our own time and culture. Few histories of distinction lack a thesis or 
principle of synthesis, such as the effect of the frontier on American life and character. 
The literary aspects of historical writing include consideration of mastery of materials,
the working outline, the principle of progression, emphasis on major elements, and the
art of narration and dramatization.


New Basic Materials for the Historian 


Rapid technological development is taking place in methods of reproducing
all kinds of documents, such as lithoprinting, photo-offset, microcopying,
sound recording, and others. Binkley (54) reported on these for the Social 
Science Research Council. The Union Catalog of the Library of Congress 
enables one to locate rare books in five hundred American libraries. This 
is an author catalog. Large libraries contain similar finding lists for other 
types of material, such as the Union List of Serials (70), the American 
Newspaper Files (69), the Writings on American History (72), and the 
several publications on manuscript collections. One of the most valuable 
new tools is the Dictionary of American Biography (77) in twenty vol- 
umes, completed two years ago at a cost of $650,000, and containing the 
biographies of 13,633 distinguished Americans. One in every ten of these, 
that is about 1,300, filled important educational positions as teachers, 
presidents, or professors in colleges and schools, scientists, and other re- 
search workers. There is not space to speak of the individual biographical 
works of recent years or of the excellent histories of higher institutions, 
but Morison’s work (81, 82) on the history of Harvard, and MacCracken’s 
guide (80) to American colleges and universities must be named. 

A recent development which is of major importance to the history of 
education as a scholarly pursuit and to all scholarly work in the United 
States is the establishment of numerous university presses and the publication
activities of the educational foundations and of the Council of Learned
Societies. The reader may find it interesting to check this remark against 
the bibliography for this chapter. 


Significant Historiography 


The history of education deals with problems old and new. Perhaps no 
problem can be disposed of finally and forever. Not only are previously 
unknown or unused documents becoming available, but new interpreta- 
tions also become necessary as the climate of opinion changes. A school 
which a century ago taught the three R’s and religion—otherwise the four 
R’s—and taught them well according to the demands of that day (by 
mechanical and drill methods) was considered a good school. Not so 
now; we do not judge schools as they were judged in 1840. The same 
applies not only to the schools themselves but also to the relations between 
the schools and society. We now believe schools must do their share in 
preparing the young for a better society if ever there is to be a better 
society. They must do their work effectively but also tactfully within the 
framework of the present society. This is really the problem of problems 
for American education and American life today. 

The problem is a basic one in democracy. How can the schools at the 
same time be conservative and progressive, responsive to the people’s will, 
and yet leading on to better things? Historical studies which deal realisti- 
cally with education must include these relations of the school to society, 
and especially to the social pressure groups and special interests. The 
growing concentration of economic power without appropriate social re- 
sponsibility is an illustration of the problem of special interests versus 
the interests of the whole people. 

Some of the recent social histories of education have been quite as 
realistic as the histories of administration and legal aspects. Some years 
ago Carlton (55, 56) studied the effect of economic influences upon edu- 
cation itself and the educational program of the Workingman’s Party. 
Curoe (59) made a much more extensive study of the educational policy 
of organized labor. Several of the volumes issued by the Commission on 
the Social Studies belong to the present category. Curti (60) wrote the 
history of the social ideas of ten American educators, Pierce (84) the his- 
tory of propaganda in schools, and Beale (52) the history of prejudice and 
pressure in education. 

A telling biography by Garber (65) is on the life of J. C. Kilgo, some- 
time the fighting president of Trinity College (now Duke University), 
North Carolina, who against a rabid press and public defended one of his 
professors who had ventured to write in praise of Booker T. Washington. 
Gellerman’s study (66) of the American Legion, an economically con- 
servative propaganda and pressure group, showed that the Legion is not 
representative of the ex-service men, of whom only one-fourth or less belong.
Tryon (90) wrote for the Commission on the Social Studies a history 
of history-teaching. Culver (58) prepared an extended study of the attacks 
upon Horace Mann by sectarians, some of them inspired by book companies.
Stauffer (88) wrote on the influence of the Bavarian Illuminati,
and Koch (78) on the Cult of Reason with the decline of Puritanism in the 
later eighteenth century. Willey (95) reported on the current depression 
and its influence in the field of American higher education. Hollis (74) 
offered a history and an analysis of one hundred larger and smaller edu- 
cational foundations. Each of the two largest has control of a capital of 
$150,000,000, and there are many which direct smaller but still very 
large sums. 


Emphasis on School Organization, Administration, and Law 


Carrying out the tendency toward realism in educational history, many 
recent studies have dealt with the organization and administration of schools 
and with school law. We may cite some illustrative studies. Reller (86) 
dealt with the origins and development of the city superintendency. Pierce 
(85) made a study of the history of the principalship in twelve cities. 

Williams (96) traced the development of the state board of education 
in California, noting the trend toward centralization. Almack (49) dis- 
cussed the development of school administration in general. Griffey (71) 
studied the history of local school control in New York State, and Holt 
(75) the twentieth-century progress in establishing a more effective sys- 
tem of schools in Tennessee. 

One of the early students of the history of school law was Elsie Clews,
who dealt with the Colonial period. Recently such studies have multi- 
plied. Elliott and Chambers (63) presented a selection of charters, con- 
stitutional provisions, and court decisions relating to fifty-one representa- 
tive institutions. The same authors in another volume (64) gave analyses 
of court decisions and treated their bearings upon college and univer- 
sity problems. Works by Edwards (62), Voorhees (92), Trusler (89), and 
Weltzin (93), while primarily reference or textbooks, represent research
in the field of school law and indicate a growing trend in documentary
study.
CHAPTER III
Legal Research in Education¹


M. M. CHAMBERS 


The results of research in the legal aspects of education have been
summarized in previous issues of the Review. In contrast to such summaries 
the present chapter is concerned with the methods and materials of legal 
study and refers to the summaries of studies only as examples of available 
materials. 

Research methods used in the field of educational law were discussed in 
the Review for February 1934 by Edwards (111). Treatments of the same 
subject have been published elsewhere by Alexander (97), Chambers 
(100), Coffey (106), Cyr and Cunin (108), Edwards (112), and Good, 
Barr, and Scates (114). All these have presented useful outlines of the 
classifications of law, the sources of school law, and the indexes and other 
bibliographic aids available in law libraries. It would be out of place here 
to reproduce any such comprehensive guides. Instead, a fivefold task is 
undertaken: (a) to summarize some new approaches and uses of legal re- 
search in education; (b) to note some recent changes and additions among 
the legal bibliographic aids; (c) to offer certain suggestions regarding the 
analysis and digesting of legal materials; (d) to discuss needed research 
and the organization of facilities and personnel therefor; and (e) to 
present a synthesis of the place of legal research in education. 


New Applications of Legal Research in Education 


The broad study of federal relations to education conducted by the Ad- 
visory Committee on Education under the chairmanship of Floyd W. Reeves 
involved some legal research in almost all its phases. In addition to its 
attention to legal backgrounds, the Committee caused to be made a special 
study of selected legal problems in providing federal aid for education. 
In this undertaking Hamilton and Mort (117) found “no insurmountable 
obstacle to ultimate nationwide equalization” of school support; that “the 
federal aid statute should be drawn with sufficient particularity to authorize 
the use of federal funds for purposes which Congress deems educationally 
desirable but which may not, under present state law, be accomplished in 
some states through the use of state funds”; and that “to assure the avail- 
ability of federal funds for transportation purposes the federal act should 
either (a) expressly provide for such use, or (b) require that the states 
bring their transportation statutes into accord with the purposes of the 
federal grant.” 

The legal vocabulary of school administrators was studied by Kephart 
(118) through the construction and use of a vocabulary test based on the 
commoner sources of school law. He concluded that the average schoolman’s
mastery of technical legal words and phrases is only about 60 percent
of what it should be, and recommended that the deficiency be remedied in
courses for administrators.

1 Bibliography for this chapter begins on page 594.


Summaries of Legal Studies 


At least thirty-four doctoral dissertations in the field of school law have 
been completed in American universities during the years 1936, 1937, and 
1938. Abstracts of all these have been collected and published by Good 
(115, 116), who also listed 119 dissertations completed during the period 
1918-35 (113). The dissertations of the three years just past may be classi- 
fied crudely by topic as follows: 


State histories of legislation or court decisions .......  7
School finance, property, or business management ........ 13
[topic illegible in source] .............................  7
[topic illegible in source] .............................  2
Public support for private or parochial schools .........  2
[topic illegible in source] .............................  1
Relationships between cities and school districts .......  1
Intercollegiate athletics ...............................  1

Total ................................................... 34


Two areas in which there have been important developments but which 
do not appear among the topics studied are the administration and in- 
terpretation of teachers’ tenure statutes and state regulation of the selection 
and provision of textbooks. Inspection of Good’s abstracts (115, 116) will 
disclose other gaps in the dissertations, constituting wide-open opportu- 
nities for new studies. 

Twenty-eight masters’ theses completed in 1935 and 1936, and eighteen 
completed in 1937, were also listed by Good (115, 116). For summaries 
of earlier research in educational law resort may be had to an issue of this 
Review devoted wholly to the subject (110), and to chapters in more 
recent issues dealing respectively with teacher personnel (102) and the 
school plant (99). The Yearbooks of School Law (105), edited by Chambers
with the collaboration of a number of specialists, carry topical
summaries of changes in educational laws and have appeared annually
beginning with 1933.


Changes and Additions among Bibliographic Aids 


Among the legal encyclopedias it is to be noted that Corpus Juris was 
completed in 1935 with its seventy-first volume, and its publishers immedi- 
ately instituted a new work which will eventually supersede it except for 
historical purposes, entitled Corpus Juris Secundum (107). This work has 
not as yet proceeded far enough in the alphabetical list of topics to reach 
“Schools and School Districts,” and hence Volume 56 of Corpus Juris 
and the annual supplementary volumes are still in current use; but a
comprehensive new article on “Colleges and Universities” has already appeared
in Volume 14 of Corpus Juris Secundum, bearing a date of 1939.

Review of Educational Research Vol. IX, No. 5

Likewise, the well-known Ruling Case Law, completed in 1930 with the 
publication of the eighth volume of its Permanent Supplement, has for its 
successor a new work entitled American Jurisprudence (98), of which 
twenty-one volumes have been published since 1936. Neither “Schools” nor 
“Universities and Colleges” has as yet been reached in this series, but both 
will in due course be covered in new articles. 

The fact should not be overlooked that there are for some states encyclo- 
pedias of the law of that jurisdiction only, such as California Jurisprudence, 
Texas Jurisprudence, and Ohio Jurisprudence. In such works the article 
devoted to “Schools” is a veritable textbook of the school law of the state, 
as well as a research tool carrying citations which will lead the student 
promptly to the important decisions in that jurisdiction. 

In the American Digest System, the end of the decade 1926-36 brought 
the appearance of the new Fourth Decennial Digest, consisting of thirty- 
four volumes, “Schools and School Districts” being in Volume 27. Further- 
more, the name of the monthly Current Digest was changed in 1936 to Gen- 
eral Digest, and the monthly issues are now being cumulated at intervals of 
five months instead of semiannually as in recent years. 

Various State Digests continue to be published from time to time, and 
despite the excellence and comprehensiveness of the American Digest System,
these compilations are invaluable keys because they cover
decisions of the general trial courts as well as those of the appellate courts. 
It is not practicable here to list them by title and publisher, but the thor- 
ough student of school law is advised to ascertain what state digests are 
available for any particular state he may be studying in detail. 

Having mastered the technics of finding judicial decisions, the research 
worker should read the same case, when possible, in both the appropriate
unit of the National Reporter System and the State Reports, because the
latter often carry headnotes written by the court and frequently print
arguments of counsel, which sometimes throw much light on the case and
also cite additional sources.


How To Read and Abstract a Judicial Opinion 


Much of Anglo-Saxon law is judge-made, because statutes must be fre- 
quently construed by the courts in litigated cases, and because the courts 
are called upon to decide many controversies not covered by statutes at all. 
Thus the ability to analyze and digest, compare and contrast, interpret, and 
correctly abstract the recorded opinions of the courts is of prime importance 
in school law research. Ten suggestions concerning how to study and use a 
decision, once it is found, are offered at this point (100: 17):


1. Observe what court is being reported and, if the case is on appeal, from what 
court or courts it has been appealed. 

2. Observe the form in which the action is brought—whether it is a petition for a 
writ of mandamus, injunction, quo warranto, or prohibition; or a suit for pecuniary 
damages in tort or in contract; or an action in equity for the specific performance or 
reformation of a contract or for an accounting of partnership or corporation affairs; 
or a criminal action brought in the name of the state against a defendant accused of 
violation of a penal statute. 

3. Segregate and digest the statement of the facts of the case, before trying to 
understand the application of the law to the facts. 

4. Determine what is the precise question of law which the court is called upon 
to decide. The court’s answer to this precise point is the decision of the case. 

5. Note whether the decision seems to be in harmony with any broadly accepted rule 
of law. (Often the court will state the broad rule.) 

6. Note whether the court indulges in any discussion of points on which it is not 
called upon to decide. Judicial pronouncements not directly related to the ratio 
decidendi, or determination of the legal issue in the case, are called dicta, and rank 
much lower than the decision in point of weight as precedents. Nevertheless, a mere 
dictum often is a figurative bomb packed full of brilliant philosophy which will illumi- 
nate its area of the law for years to come. 

7. In view of the fact that all courts of last resort and most appellate courts are 
collegial—that is, consisting of several judges sitting en banc—note whether the deci- 
sion and the opinion are concurred in by all the judges sitting or whether one or more 
judges have filed specially concurring opinions or dissenting opinions. These may be 
important for the sake of the social philosophy they express. A large proportion of the 
essential wisdom to be extracted from the records of the United States Supreme Court 
for a generation past is to be found in the long line of brilliant dissenting opinions by 
the late Justice Oliver Wendell Holmes. These opinions were frequently concurred in 
by Justice Brandeis, later by Justices Stone, Roberts, and Chief Justice Hughes, finally 
coming to represent the social philosophy of a majority of the court in many particulars.

8. Try to orient the case in your own social and legal philosophy; evaluate the deci- 
sion and the opinion critically, sympathetically, tolerantly; struggle to interpret it and 
express it vividly and meaningfully; make it live in the mind of your reader. 

9. Get the complete and correct caption and citation, including date. 

10. Jot down the supporting authorities cited in the course of the opinion and follow
them up, thus developing the history of the principles of law involved and disclosing
trends.


Needed Research, Facilities, and Personnel 


One hundred and ten current problems in educational law were listed 
by Chambers and associates (104). The list was drawn up from a nation- 
wide viewpoint and could be greatly extended if made to include specific 
problems peculiar to particular states and regions. Some inkling of the 
fact that the solution of these problems has significant bearings upon the 
future of American democratic society may be had merely from a moment’s 
reflection upon the first question in the list: “Shall any state permit any 
feature of its laws relating to residence, tuition, or transportation of pupils
to remain such that any boy or girl of secondary-school age is denied rea- 
sonably easy access to suitable high-school facilities?” 

Another brief published discussion of issues in school law (101) treated 
four great areas in each of which are many unsolved problems: (a) teacher 
tenure legislation and its interpretation, (b) relations between the public 
and private schools, (c) the extension of the public school program both 
upward and downward on the scale of age, and (d) the relationships of 
the public schools to the kindred social welfare services. Three other equally 
important areas are also mentioned: (a) federal relations to education, (b) 
the evolution of state systems of school finance, and (c) the changing rela- 
tionships between public school authorities and state and local noneduca- 
tional fiscal authorities. 

Three to four hundred decisions of cases involving one or more discrete 
issues in school law are handed down by the higher courts of the states and 
the federal tribunals each year (103). The number of cases in the lower 
courts of record is, of course, much larger; and in a single populous state, 
such as Ohio, the attorney general’s office renders about one hundred opin- 
ions on school law questions annually. 

Important to the improvement of the quantity and quality of research on 
the legal basis of education is the assignment of appropriate personnel in 
graduate schools and in state education departments, and the organization 
of research agencies at the national level. Pre-doctoral research in school 
law should have the guidance preferably of professors broadly trained in 
education, law, and political science. In the absence of any one such member 
of a graduate school staff, a joint committee of carefully chosen professors 
in each of the three fields suggests itself. 

Every state education department should have one staff member especially 
competent in school law, with facilities for research and publication in that 
field. All quasi-judicial decisions, administrative orders, and important 
correspondence of the chief state school officer involving the interpretation 
of the school statutes should not only be kept in files available to the 
public for legitimate purposes but also should be printed and bound at 
suitable intervals and made available to school personnel throughout the 
state. Current items of importance should be regularly published in the 
official periodical of the state education department. The cost of these serv- 
ices would be more than recouped in the lessening of unnecessary corre- 
spondence and the reduction of needless uncertainties in the minds of local 
schoolboard members and superintendents and teachers throughout the 
state. 

At the national level some commendable but necessarily fragmentary 
research is carried on by the United States Office of Education, the Na- 
tional Education Association, the American Council on Education, and 
other agencies; but in each of these the assignment of personnel and the 
facilities for publication are as yet too limited for the type of fundamental 
and continuous attack which the broad field of American school law merits. 
There are some possibilities of better coordination of the present work of 
the several existing agencies, but the best hope for the development of a 
more nearly adequate service rests in the evolution of some one office, 
sufficiently supported from governmental or philanthropic sources or both, 
with directing personnel, staff, and facilities for research and publication 
commensurate with the great significance to our democratic society which 
inheres in the evolution of the legal basis of education throughout the 
nation. 


The Place of Legal Research in Education


A composite statement on the importance of research in school law may 
be had by juxtaposing four brief quotations. The first is from Alexander 
(97: 200):

The world over, law has a unique value for all concerned with social institutions 
like the school. While much hasty and unwise legislation passes in many countries, 
nowhere outside the law, which Woodrow Wilson defined as “crystallized custom,” are 
the principles of human relationships so well stated. Accordingly, the educator con- 
cerned with securing fundamental principles for any phase of education will do well 
to go promptly to the chief legal formulations of conduct involved. 

In the United States, law has another great importance for the educator. It represents 
the aspirations of the American people. 


The second is from Cyr and Cunin (108: 509):


Legal research constitutes one of the important fields of educational research. Public 
schools and school districts are in the last analysis creatures of the law. Their adminis- 
trative structure, functions, and authority are dependent on laws and legal decisions. 
With forty-eight states each developing its own legal system, in addition to the federal 
government, a wide variety of educational practices has developed during the last 
century and a half. Examination of the historical development of school law in the 
United States and comparative studies of the legal provisions in the different states, as 
well as regional differences throughout the nation, offers a rich field for the educational 
research worker. . . . Such research is needed to provide a body of knowledge for those 
who are charged with the responsibility of making new laws and modifying old ones. 


The third is from Good, Barr, and Scates (114: 270): 


Although the concern of the school administrator and board member with school 
law is apparent, the interest of the classroom teacher in such problems can be justified 
without much difficulty. The right of the teacher to control children going to and from 
school, the teacher’s liability for injury to children while at school, and the influence of 
the courts upon the curriculum are all legal problems. There seems no logical reason 
why teachers and field workers should feel limited in their investigations to studies of 
present-day teaching methodology and curriculum reorganization.


The fourth is from Chambers (100: 1, 19): 


Hitherto the two professions of education and law have developed, for the most 
part, quite apart from each other. Each has its own literature, its own bibliographical 
methods, its own technical vocabulary, and its own methods of approach to research. 
For most of the members of either profession, the other is as a closed book not to be 
opened—as an unfathomable mystery not to be explored by anyone other than its own 
professional devotees. Every student of human relations today can observe that this 
compartmental exclusiveness with which different departments of human knowledge 
have developed must be to some extent broken down if we are to produce leaders who 
are able to survey and comprehend the complex problems which overlap into all the 
fields of social science. Consequently, the student who ventures into the field of school 
law may be assured that he is undertaking a real service to the advancement of knowl- 
edge. One who works in this area should set for himself the goal of mastering the 
methods and meeting the standards of both professions insofar as they apply to his own 
scholarly work. . . The field of school law is immense and largely unworked. 


Edwards (109) contributed another excellent exposition of the basic 
nature of the American law of public schools, too extensive to permit of 
summarization at this point. 


CHAPTER IV


Quantitative Analysis of Documentary Materials¹
EDGAR DALE 


THE HISTORY OF EDUCATIONAL RESEARCH discloses many quantitative
studies of the content of textbooks, errors in grammar and spelling, analyses 
of technical and nontechnical vocabularies, the mathematics used in every- 
day life, and the like. This general area of research was summarized, with 
illustrations, by Good, Barr, and Scates (133). They recognized textbook 
analyses, analyses of large bodies of literature, activity analyses, vocabulary 
analyses, error studies, and analyses of record and report forms. To these 
we should probably add the quantitative analysis of motion pictures. 


Limitations of Quantitative Documentary Analyses for Curriculum 
Making 

The value of such studies for curriculum building has been questioned 
by some who believe that little educational guidance can be secured from 
them. The questioning may be valid if one assumes that the results of these 
analyses are to be taken directly as objectives of instruction. But it is only 
when they are regarded as data to be used as a suggestive guide in formu- 
lating certain of the objectives of education that they have significant value. 
For example, the teacher or curriculum maker should know the names, 
places, and events frequently mentioned in periodical literature. Yet these 
do not immediately become the goals of learning. They must be subjected 
to logical analysis first. 

One of the objections to the use of results from such studies is that they 
yield “status” data. Bagley (120) made this clear in his comment on his 
own study of geographical references in newspapers and magazines. He 
said: 

. . . any method that attempts to utilize literature as a criterion for the selection of
educational materials should be applied with a distinct understanding that it may 
simply result in a circular form of reasoning; current literature of a “general” nature 
is likely to represent pretty accurately “general” education. In some respects, it is just 
as valid to infer from the content of the school program what the character of current 
literature will be as to infer from the character of current literature what the content 
of the school program should be. Certainly, if there is a causal relationship, it is from 
the school to current literature, and not vice versa. (120:133.) 

A second objection to analyses of literature, of errors, or of activities is 
that one cannot infer importance directly from frequency. A third objection 
is that such studies tend to place an emphasis on the more mechanical 
aspects of learning and fail to reveal the need for learning the more 
dynamic elements of behavior. There are other objections also. They are 
discussed at some length by Good, Barr, and Scates (133) together with 
the entire problem of interpreting such studies. 

¹ Bibliography for this chapter begins on page 595.

It should be noted that frequency counts of documentary material may
be put to some very different uses, which avoid most or all of the criticisms 
referred to above. These will be mentioned later. We shall proceed to some 
of the technical problems of frequency studies. 


Delimiting the Problem and Classifying the Cases 


When data of the discrete type considered in this chapter are collected 
and analyzed, it is imperative that the problem and its limits be stated with 
extreme clarity. In the field of spelling, for example, it is not enough to 
state that the purpose of a study is to ascertain the words most commonly 
used in writing. Shall it be adult writing or child writing or both? Does 
one wish to include writing such as occurs in a diary, marginalia, or 
“doodling”? Shall personal letters be included, and, if so, is there a pos- 
sibility of inadequate sampling of certain types, that is, intimate personal 
letters? Shall the writing studied include materials prepared in school on 
the basis of teacher assignments? Obviously, the investigator may make his 
study as broad as he wishes, but unless he states his limits carefully, diffi- 
culties may arise. 

Classification illustrated by motion picture studies—One of the most 
vexing problems faced by an investigator who is collecting data is the 
classification of his materials, that is, what categories to employ in making 
his tabulations. So long as an extensive number of categories does not in- 
crease tabulation difficulties unduly, one should err on the side of com- 
pleteness because one can combine categories later on if necessary. There 
is a good deal of room here for preliminary exploration. The writer dis- 
covered, for example, after considerable study, that newsreel materials were 
advantageously divided into twenty-eight categories. Furthermore, it was 
not difficult to train new tabulators to recognize these categories and to 
classify accurately the materials which they had to handle. An important 
element which should be considered in one’s classification is the degree to 
which it will answer significant questions that are being raised about the 
object or condition studied. One common question asked about newsreels 
is the extent to which they are monopolized by sports and military activi- 
ties. Our classification has made it possible for us to show clearly that 
sports are the most common in all newsreels, that war and military prepa- 
ration are usually second or third in amount, and that such topics as agri- 
culture, dairying, and ranching usually rank at the bottom of the list. This 
classification scheme enabled us to ascertain that in a recent four-week 
period over half of the newsreels dealt with war or the preparations for 
war, and of these items 34 percent dealt with the United States, 56 percent 
with the Allies, and 10 percent with Germany. 
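
The advice to begin with many categories and combine them later may be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item labels and the mapping named BROAD are hypothetical and are not the twenty-eight categories used in the newsreel study.

```python
from collections import Counter

# Hypothetical fine-grained labels assigned by tabulators to newsreel items.
items = ["football", "army drill", "baseball", "naval review", "dairying", "football"]

# Hypothetical mapping, applied only when a coarser summary is wanted.
BROAD = {
    "football": "sports",
    "baseball": "sports",
    "army drill": "military",
    "naval review": "military",
    "dairying": "agriculture",
}

def percent_shares(labels):
    """Return {category: percent of all items}, largest share first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cat: 100 * n / total for cat, n in counts.most_common()}

fine = percent_shares(items)                           # detailed tabulation
broad = percent_shares(BROAD[label] for label in items)  # combined afterward
```

Because the detailed labels are kept, the coarser summary can be produced at any time; the reverse step would not be possible had only the broad classes been recorded.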

Another classification of the themes of motion picture feature films has 
been carried forward over a period of years (129). These data indicate 
the proportions of pictures dealing with the following categories: crime, 
sex, love, mystery, war, children, history, travel, comedy, social problems. 

Classification illustrated by word counts—In Thorndike’s counts of the 
frequency of words (142), except for very special reasons, separate entries 
were not made for plurals in s, plurals where y is replaced by ies, adverbs 
formed by adding ly, and the like. To correct for the inaccuracies involved
in tabulating as a single word all words which are spelled alike
but have different meanings, such as bear, a further study was made by
the Institute of Educational Research, called the “English Semantic Count” 
(138). This analysis used the index provided by the Oxford English 
Dictionary. In deciding questions of classification one must consider the 
use which is to be made of the results. For example, in a spelling study 
where one is attempting to discover the words used by children and adults 
in their writing it is important to count as different words such variations 
as “write,” “writing,” “written,” and “wrote.” In a vocabulary analysis, 
however, it might be less important to distinguish between these various 
forms; they might all be counted under the word “write.” The general 
problem of classification has been further discussed by Scates (141). 
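
The difference between the two purposes may be shown with a short Python sketch. The table named HEADWORDS is a hypothetical stand-in for whatever list of inflected forms an investigator would prepare; it is not taken from any of the counts cited.

```python
from collections import Counter

# Hypothetical headword table: maps inflected forms to one dictionary entry.
# A real study would need a far larger, carefully checked table.
HEADWORDS = {
    "writing": "write",
    "written": "write",
    "wrote": "write",
    "letters": "letter",
}

def count_forms(words):
    """Spelling-study view: every distinct spelling is its own entry."""
    return Counter(w.lower() for w in words)

def count_headwords(words):
    """Vocabulary-study view: inflected forms are merged under a headword."""
    return Counter(HEADWORDS.get(w.lower(), w.lower()) for w in words)

sample = "I wrote two letters and she has written one letter".split()
forms = count_forms(sample)      # "wrote" and "written" counted separately
heads = count_headwords(sample)  # both counted under "write"
```

The same running text thus yields two different tabulations, and the choice between them rests on the use to be made of the results.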


Stating the Assumptions 


It is important that the investigator indicate the basic assumptions which 
underlie his study or which will attend the use of the findings. For ex- 
ample, if studies of the grammatical errors made orally by school children 
are utilized for building a grammar curriculum, it is assumed that an 
important function of the study of grammar is to improve speech. A sub- 
sidiary assumption is that if the most common oral grammatical errors 
are known by the teacher and are directly attacked in the curriculum the 
grammar of speech will be improved. It is sometimes further assumed that 
the sentence context in which grammatical errors are made is not signifi- 
cant; hence, only the specific errors are collected. Statistical assumptions, 
especially those relating to sampling, also ought to be clearly stated. 

The statement of assumptions serves two important functions. It some- 
times reveals to the investigator the untenability of certain of his assump- 
tions and clarifies the problem for him. It assists the research consumer 
in determining the extent to which he will accept the findings, because it 
enables him to evaluate the assumptions. 


Sampling Problems 


How much data should one obtain? The answer lies in the degree of 
reliability which will suffice for one’s purpose. It is possible in most cases 
to determine statistically the size of the sampling necessary to secure the 
desired reliability. For example, Thorndike (142), in his study of the 
frequency of words in reading materials, secured a measure of reliability 
by getting two sums from random halves of sources and computing the 
median probable displacement from the position that a word would have 
if obtained from an infinite number of such counts. He discovered that 
about twenty-five of the words rated in the first five hundred would be 
placed elsewhere by an extensive count. Further, as Thorndike stated, 
“Counts of a million books would probably replace only about 1,000 of 
the 20,000 words by others.” 
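
The general idea of checking a count against random halves of its sources may be sketched as follows. This is a simplified stand-in and not Thorndike's own statistic: it reports only the median difference in rank between the two halves, and the function names and the seed parameter are assumptions made for the illustration.

```python
import random
from collections import Counter
from statistics import median

def rank_words(counter):
    """Return {word: rank}; rank 1 is most frequent, ties broken alphabetically."""
    ordered = sorted(counter, key=lambda w: (-counter[w], w))
    return {w: i + 1 for i, w in enumerate(ordered)}

def split_half_displacement(sources, seed=0):
    """Median absolute difference in rank between two random halves of sources.

    `sources` is a list of word lists, one list per source document.
    Only words that occur in both halves are compared.
    """
    rng = random.Random(seed)
    shuffled = sources[:]
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    half_a = Counter(w for src in shuffled[:mid] for w in src)
    half_b = Counter(w for src in shuffled[mid:] for w in src)
    ranks_a, ranks_b = rank_words(half_a), rank_words(half_b)
    shared = set(ranks_a) & set(ranks_b)
    if not shared:
        raise ValueError("the two halves share no words")
    return median(abs(ranks_a[w] - ranks_b[w]) for w in shared)
```

A small displacement suggests that further counting would move few words; a large one suggests that the sample of sources is still too small for the ranks to be trusted.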

Before the sampling is begun the investigator ought to list every known 
factor that may influence the nature of the data to be examined. For ex- 
ample, in the study of the vocabulary of individuals, sex, age, geographical 
location, and other factors are likely to influence the data. The investigator 
should seek to make up the sampling study in such a way as to include a 
representative proportion of cases falling in each category of the factors 
previously listed. It may be desirable to tabulate certain groups separately, 
as different sexes, different ages, and the like. Such data can always be 
combined later but cannot be unscrambled if one fails to keep them sepa- 
rate. Subsequent investigators may desire to make use of the separate group 
data. In areas such as the study of spelling or the study of vocabulary, 
this point is sometimes important. 
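
The point that separate tabulations can be combined, while a combined tabulation cannot be separated, may be sketched in Python. The group labels and words are hypothetical.

```python
from collections import Counter, defaultdict

def tally_by_group(records):
    """records: iterable of (group_label, word). Returns {group: Counter}."""
    tallies = defaultdict(Counter)
    for group, word in records:
        tallies[group][word] += 1
    return dict(tallies)

def combine(tallies):
    """Merge the per-group tallies into one overall count."""
    total = Counter()
    for counts in tallies.values():
        total.update(counts)
    return total

# Hypothetical records: (group, word written).
records = [("girls", "dog"), ("boys", "dog"), ("boys", "ball"), ("girls", "doll")]
by_group = tally_by_group(records)  # kept separate for later investigators
overall = combine(by_group)         # combined only when wanted
```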

When one realizes that Gallup (119) secures reliable data by querying 
at times fewer than 10,000 persons carefully selected throughout the United 
States, it is seen that the skilful application of statistical principles of 
sampling will often eliminate the need for collecting huge quantities of 
data. By keeping comparable halves one may be able to determine when 
statistically reliable data have been secured. Furthermore, in studies seeking 
only percents of occurrence, one can eliminate bit by bit those cases for 
which adequate sampling has been made. 
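
For studies seeking only percents of occurrence, one common textbook approximation for the number of cases needed is n = z²p(1 − p)/e², where e is the tolerated error in the proportion. The sketch below applies it under the assumption of simple random sampling, which is not the selection procedure of the polls mentioned above; the function name and default values are likewise assumptions.

```python
import math

def sample_size_for_proportion(margin, p=0.5, z=1.96):
    """Cases needed so that an observed proportion lies within `margin`
    of the true one, by the normal approximation for simple random samples.

    p = 0.5 is the most cautious guess; z = 1.96 corresponds to about
    95 percent confidence.
    """
    if not 0 < margin < 1:
        raise ValueError("margin must lie between 0 and 1")
    if not 0 <= p <= 1:
        raise ValueError("p must lie between 0 and 1")
    return math.ceil(z * z * p * (1 - p) / (margin * margin))
```

With the default settings a margin of five percentage points calls for 385 cases, and a margin of one point for roughly 9,600, which is of the same order as the figure cited above.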


Tabulating the Data 


The problem of tabulation can be ameliorated by these devices: (a) by 
having tally sheets for the most frequently appearing items close at hand 
for tabulation, for example, a separate checking list for the most frequently 
appearing items; (b) by watching the sampling carefully and stopping the 
tabulation for the most frequent items at a time when additional data will 
not significantly change the frequency proportions; (c) by experimentally 
developing mechanical equipment which might eliminate some of the time- 
consuming activities of tabulation; (d) by using available equipment, as 
the Hollerith electrical scoring and counting machines. 

Horn (137) had spelling words recorded on sheets 8½ x 13, indexed on
the left-hand side of the sheet with beginning letters of key words such as 
ab, ace, add, aff, ag, and the like. Preliminary experimentation indicated 
the amount of space necessary between the tabs. A series of sheets indexed 
and spaced to hold 10,000 words was found to be most satisfactory. When 
the sheets containing approximately 10,000 running words became crowded, 
the words were then transferred to larger series of sheets holding about 
100,000 running words. Each tabulation sheet was carefully marked to in- 
dicate the kind of material from which the words were taken. Horn indicated 
that an unsatisfactory attempt was made to use Tidyman’s method of 
putting each word on a separate card and then sorting the cards. 

Experimental work is going forward with the use of Hollerith equipment 
in the tabulation of errors. Greene (135) reported, for example, that more 
than 1,500,000 running words of the actual writing of children in Grades 
IV to IX have been transferred to specially designed Hollerith cards. Each 
card carries a single sentence and each is coded to permit the identification 
of the sentence, the particular composition or letter from which it is taken, 
and the serial position of the sentence in the composition or letter. Later it 
is planned to code the same cards for other items such as verb usages, pro- 
noun usages, and sentence variety. The tabulation of these materials by 
sentences and with keys by means of which the original can be identified is 
especially important. Many of the studies made in these various fields are
not easily repeated, nor are the data secured useful for purposes other than
those for which the research was initially begun.
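
The record plan described above, one sentence per card with keys leading back to the original, may be sketched as a modern data structure. This is an illustration only and does not reproduce the actual card layout; the field names and sample sentences are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SentenceCard:
    composition_id: str   # which composition or letter the sentence came from
    position: int         # serial position of the sentence, starting at 1
    text: str             # the sentence itself
    codes: tuple = ()     # room for later coding passes (hypothetical labels)

def rebuild(cards, composition_id):
    """Reassemble one composition from its cards, in original order."""
    own = [c for c in cards if c.composition_id == composition_id]
    return " ".join(c.text for c in sorted(own, key=lambda c: c.position))

# Hypothetical cards, deliberately out of order.
cards = [
    SentenceCard("letter-07", 2, "We went to the fair."),
    SentenceCard("theme-12", 1, "My dog is brown."),
    SentenceCard("letter-07", 1, "Dear Aunt May."),
]
```

Because every card carries its own keys, the cards may be sorted for any new purpose and the original composition can still be recovered.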


Various Curriculum Illustrations of Documentary Analysis


The range of quantitative studies is extensive. Bobbitt (121), with the 
assistance of a number of students, carried on a significant quantitative 
analysis of written materials in order to discover data useful in curriculum 
construction. This included an analysis and classification of all topics in
the Reader’s Guide to Periodical Literature covering the years 1919-21, an
activity which resulted in the classification of 11,000 different topics. The
topics in each of these subordinate fields were then redivided and
quantitative analyses were made of these items. Bobbitt made it clear that one
cannot assume that there is a significant positive correlation between the
frequency with which these items are mentioned and their importance in the
curriculum. 

Similar analyses were made of 180 issues of the New York Times, 14,740
columns of the Encyclopaedia Britannica, topics referred to by the ten
thousand most frequently used English words, topics treated in the Literary
Digest, and others.

Pierce (139) made a study of the civic attitudes as disclosed by the 
analysis of American school textbooks. These books included 97 histories, 
67 books on civics and social and economic problems, 45 geographies, 109
readers, 10 French books, 4 textbooks in Italian and 7 in Spanish, and 50 
music books. The books were surveyed to discover the civic attitudes pre- 
sented to people who read them. It should be noted that the significance 
of her data depends largely on the assumption that the textbook is one of 
the chief agencies of instruction in subject matter.

Charters carried forward a number of curriculum studies making use 
of quantitative analyses. These include, among others, The Commonwealth 
Teacher-Training Study (126), his study of secretarial duties and traits 
(124), and his study of basic material for a pharmaceutical curriculum 
(125). Charters made clear in these studies that such findings are raw 
materials for curriculum construction and that evaluatory judgments must 
be expressed in reference to them before they finally become a part of a 
curriculum. 

Caution—One must be continuously on guard against assuming that 
the frequency of appearance of an item, whether it is a grammatical error, 
the most frequent type of arithmetical operation, or the most commonly 
misspelled word, dictates clearly and unequivocally what the educational 
treatment of that situation shall be. To accept such a conclusion is to fall 
into a narrow, mechanistic concept of education. Objective data secured in 
the quantitative studies which have been described must in turn be quali- 
tatively and subjectively handled by teachers, textbook writers, and 
educators in order to determine the learning situation in which they are
to be used.


Indexing Difficulty and Trends 


Dale and Tyler (131) analyzed health materials to discover factors which 
might correlate with the reading difficulty of such materials as experienced 
by adults of limited reading ability. The number of hard technical words, 
hard nontechnical words, and the general complexity of sentences were 
among the three highest predictors of reading difficulty. Other analyses 
were made of length of sentences, parts of speech, types of pronouns, and 
beginning letters of words. Similar studies were made of reading materials 
by Gray and Leary (134). 

Brownell (122) studied the research on sixty-seven problems in arith- 
metic. By a system of tabular cross classification he showed the extent to 
which nineteen different research technics have been employed in solving 
these problems. One notes, for example, that the testing technic has been 
used 125 times in these studies, the laboratory technic only twenty-two times. 
Analyses of research such as these are extremely useful to workers in the 
field. 
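
A tabular cross classification of this kind may be sketched in Python. The problem and technic labels below are hypothetical and do not reproduce the sixty-seven problems or nineteen technics of the study cited.

```python
from collections import Counter

def cross_tabulate(pairs):
    """pairs: iterable of (problem, technic).

    Returns the cell counts together with the totals for each problem
    and for each technic.
    """
    pairs = list(pairs)
    cells = Counter(pairs)
    by_problem = Counter(problem for problem, _ in pairs)
    by_technic = Counter(technic for _, technic in pairs)
    return cells, by_problem, by_technic

# Hypothetical entries: one pair for each use of a technic on a problem.
studies = [
    ("column addition", "testing"),
    ("column addition", "laboratory"),
    ("long division", "testing"),
    ("long division", "testing"),
]
cells, by_problem, by_technic = cross_tabulate(studies)
```

The totals by technic correspond to statements such as the number of times the testing technic has been used; the cells show on which problems it was used.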

Several chapters in the study, Recent Social Trends (140), illustrated 
the analysis of trends in social phenomena. We may cite the work of Willey 
and Rice (146) on changes in the amount and kind of communication, 
based on an analysis of materials handled by the post office and by tele- 
graph companies, and developments in the field of periodicals, newspapers, 
and radio programs. Hart (136), in discussing changes in popular interest 
and in certain attitudes, counted the number of articles which had appeared 
in certain classes of magazines indicating the views of authors and the 
supposed interest of the readers. 

In the field of analyzing moving picture films taken in the laboratory, 
we have yet another type of documentary analysis. In this instance the 
film is used primarily as a means of observation and recording; it is not 
an example of meaningful content of ordinary communicative documents. 
For the analysis of eye movements in reading, we may cite Tinker’s 
bibliography (143) of this field, and for a more recent use—the analysis 
of viewing art work—we may cite Buswell’s study (123). 
CHAPTER V


Direct Observation as a Research Method¹


ARTHUR T. JERSILD and MARGARET F. MEIGS 


DIRECT OBSERVATION is the oldest, and remains the commonest, instrument
of scientific research. Its systematic use in research in child development
and education has become especially prominent during the past
fifteen years. Several factors have given impetus to its use, including the
establishment of centers for research in child development; the demands 
of the “newer education”; a desire to probe aspects of behavior not acces- 
sible to the conventional paper and pencil, interview, or laboratory tech- 
nics; a desire to obviate some of the subjective errors likely to enter into 
the customary rating procedures; an emphasis on the need for studying 
children in “natural” situations, and for studying the functioning child, 
including his social and emotional behavior, rather than to rely exclu- 
sively on static measurements of mental and physical growth. 

The present report will deal primarily with applications of direct ob- 
servation procedures since about 1925-26, when Olson and Goodenough, 
shortly followed by Thomas and others, focused attention upon the method 
itself, its possibilities, and the procedures and safeguards necessary to 
establish it as a reliable scientific tool. For earlier reviews and discussions,
see Olson and Cunningham (190); Good, Barr, and Scates (163); Goodenough
and Anderson (165); Murphy and Murphy (185); Anderson
(148); Bott (156); Arrington (151); and Symonds (197).


Development of the Method


Olson (187), in an investigation of nervous mannerisms of children,
“attempted to apply the general principles of scientific measurement as
evolved in biometric work to observations of behavior.” He applied what 
came to be known as the “time sampling technic.” The method was fur- 
ther explored by Goodenough (166, 167), and Parten (191); by Thomas 
and associates (199), Arrington (149); and by numerous subsequent 
investigators (154, 156, 170, 173, 175, 176, 178, 184, 186, 198). 

The details of procedure and the practical recommendations emerging 
from studies designed to develop the scientific integrity of direct observa- 
tion have varied, but certain requirements may be noted, including sys- 
tematic recording in objective terms of behavior in process of occurring, 
in a manner that will yield quantitative, individual scores according to 
procedures that involve the following safeguards: (a) definition, in terms 
of overt action, of the units or patterns or contexts of behavior that are 
recorded and scored; (b) measurement of the objectivity of definitions 
so used; (c) “control” of the observer; (d) measurement of the reliability 


1Bibliography for this chapter begins on page 597. 
or fidelity of the observer; (e) appropriate timing and distribution of observation
periods; and (f) observation periods that are sufficient in
duration and number to give a reliable sampling of the behavior with which 
the study purports to deal and for which quantitative results are reported. 

In many of the earlier studies, the units of behavior were defined in ad- 
vance, and the observer could then enter a check or symbol on the record 
each time an item of a given category occurred. Predetermined categories 
of this type have been utilized in a number of subsequent studies, and 
have been found adequate for many purposes.² Many recent studies,
however, while retaining the measures of objectivity and reliability, have 
abandoned the use of fixed, predetermined categories in favor of a more 
fluid “running account” of what is happening. As indicated in a study 
by Jersild (176), the use of predetermined categories to probe restricted 
aspects of complex forms of behavior may fail to yield data that authenti- 
cally reflect what transpires. As has been indicated in several studies, a 
procedure obviously fails in achieving its purpose if objectivity, in a literal 
sense, and reliability, in the statistical sense, are gained by sacrificing more 
and more of the substance with which a study purports to deal (154, 175, 
176, 185, 186). Further, for many purposes, it is more important to con- 
sider given items of behavior in terms of their context and the pattern of 
which they are a part than to obtain simply an accumulation of isolated 
tallies (160, 177, 186). There are pitfalls in this direction, also. A pro- 
cedure so highly skeletonized that it yields only a few dry bones is no more 
to be avoided than a treatment which becomes so entangled in fine shades 
and nuances that nothing emerges save a lengthy and quite inconclusive 
volume of chitchat about each individual who has been observed. 


Situations to Which the Method Has Been Applied 


Direct observation has been used in studies of practically all aspects of 
the behavior of young children, and it has been quite widely applied in 
classrooms, camps, homes, discussion groups, playgrounds, museums, 
studies of the behavior of adults, and special situations (see preceding 
footnote).

The method of direct observation has been applied in situations that are 
“free” (with no restrictions other than those normally inhering in the 
situation itself); or manipulated (as when the investigator, in order to 
precipitate given types of behavior, injects a special or arresting factor) 
(186); or partially controlled (as when children are confined to a given

2 To conserve space, illustrative materials and numerous references had to be omitted in the final draft of
this and other sections of the paper. An indication of some of the problems that have been studied by
adaptations of the method of direct observation can be found in the bibliography, which exceeds the number
of references specifically mentioned in the text, but does not list several important studies, including
several numbers of the University of Toronto Child Development Series; the University of Iowa Studies in
Child Welfare; and the University of Minnesota Institute of Child Welfare Monographs (in several of which
effective use has been made of semicontrolled situations); and Child Development Monographs, Teachers
College, Columbia University (including several studies utilizing predetermined categories and a later group
utilizing the “running account” procedure in free or partially controlled situations).
room or situation or are confronted with certain conditions or directions,
with freedom to respond to such conditions in their own way) (147, 160, 
172, 173, 182, 183, 184). Many studies have utilized both free and con- 
trolled situations (172, 173, 186). The method can also be used to study 
behavior appearing incidental to a situation that definitely confines the 
individual, for example, observations of accessory movements exhibited 
when a person takes a mental test (152). 







Mechanics of Recording 


Technics of record-taking range from the use of standardized record 
sheets constituting, in effect, a checklist, and ruled off to accommodate pre- 
determined categories of behavior and units of time, to the use simply of 
a blank pad of paper on which the observer records as fast as he can 
as much as he can hear or see. There are many intermediate procedures: 
categories of behavior may be recorded (usually by code) in a running 
account; the checklist form may be combined with space for supple- 
mentary running comment; the “running account” or “diary record” type 
of recording usually will involve the use of many abbreviations and 
symbols, and it may combine the recording, by symbol, of certain pre- 
defined units of behavior along with a more fluid account of the context 
in which such units of behavior occur. 

Experiments have been made with mechanical aids in keeping account 
of the passage of time during the observation period (150, 196, 200, 202). 
In situations corresponding somewhat to those in which the method of 
direct observation applies, use has been made of mechanical aids in re- 
cording (153, 169). Much use has also been made of photography and 
motion pictures (162, 188). In many situations there are features of be- 
havior that cannot, with present equipment, adequately be recorded by 
means of motion pictures (by reason of problems of lighting, mobility, 
time, and the like). Sound films have been used as a means of studying 
the accuracy of observers (150, 198). 

Longhand and shorthand accounts have been compared with each other 
and with mechanical recordings (153, 199). For the purposes of a given 
study, the efficiency of the observer may depend less upon speed of writing 
or degree of fidelity in reproducing minute details than upon the worker’s 
“style of observation and the amount of selectivity” which he exercises 
(157). 








Definition of Units of Behavior and Scope of Observations 


In practically all situations it is necessary for the observer to be selec- 
tive; he cannot see and record everything. Whether the object is to obtain 
considerable detail on a limited aspect of behavior or to record the oc- 
currence of gross types or patterns of activity, it has been found neces- 
sary to plan what the emphasis will be, what especially will be noted, 
what is to be ignored or simply to be noted in general terms. Unless the
investigator is reapplying procedures developed in earlier studies, the 
usual practice has been to do a great deal of preliminary exploration. 

In any event the investigator finds it necessary, sooner or later in the 
course of the study, objectively and in some detail to define the units or 
categories of behavior in terms of which he proposes to produce quanti- 
tative data, and he must take steps to measure and to report the results 
of his measurements of the objectivity of his definitions. 

A frequently used test of the objectivity (and applicability) of a set of 
categories is to have two or more independent workers classify the contents 
of records of the behavior that is being studied. (In the exploratory stages 
of a study, a worker can measure his own consistency in classifying the 
same data at different times. If he cannot even agree with himself, it is ap- 
parent that his scheme needs revision.) Item by item comparisons of the 
analyses can then be made, and the agreement can be computed in terms 
of percents. In practice, this process may have to be repeated several times 
with progressive refinement of definitions, removal of ambiguities, and 
contraction or extension of the list of separate categories or behavior units. 
This procedure obviously does not demonstrate whether the categories 
eventually chosen are the best that might have been chosen, but if care- 
fully carried out it does mean that definitions are made explicit, that 
hidden or subjective considerations are substantially eliminated, and that
other investigators in the same field can either utilize the same definitions 
or demonstrate, objectively, what their defects may be. 
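
The item by item comparison described above can be sketched in a few lines. This is a minimal illustration only, not the procedure of any study cited here; the category labels, the sample classifications, and the function names are hypothetical.

```python
from collections import Counter

def percent_agreement(labels_a, labels_b):
    """Percent of items that two independent workers put in the same category."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both workers must classify the same items")
    if not labels_a:
        raise ValueError("no items to compare")
    agreements = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * agreements / len(labels_a)

def disagreement_pairs(labels_a, labels_b):
    """Count each (worker A label, worker B label) pair that differs.

    Pairs that recur point to definitions that may need refinement.
    """
    return Counter((a, b) for a, b in zip(labels_a, labels_b) if a != b)

# Hypothetical classifications of the same ten recorded items.
worker_a = ["aggression", "cooperation", "cooperation", "aggression", "solitary",
            "cooperation", "solitary", "aggression", "cooperation", "solitary"]
worker_b = ["aggression", "cooperation", "solitary", "aggression", "solitary",
            "cooperation", "solitary", "cooperation", "cooperation", "solitary"]

print(percent_agreement(worker_a, worker_b))
print(disagreement_pairs(worker_a, worker_b))
```

Listing the pairs of differing labels, rather than reporting only the percent, is one way to see which definitions to revise on the next round.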

Among other matters that investigators have been called upon to define 
are the criteria as to what constitutes a new or discrete occurrence of the 
behavior in question. (This problem is obviated, to a great degree, when 
scores are computed in terms of time units, as shown below.) For example, 
opening the door for another and helping another to carry a load may each 
fall under the definition of cooperative behavior and receive a separate 
tally when occurring at different times; but if both occur together in a 
single episode, or if one of the acts is repeated in immediate succession, 
should one or two tallies be given? The answer may have to be somewhat 
arbitrary but it can be explicit. Similarly defined must be the ways in 
which the same event might be tallied under different headings or sub- 
headings; for example, Jack runs up and punches Jill’s nose—this is 
an instance of “aggression” and it represents also a “social contact,” not 
to mention many other things. 
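
An arbitrary but explicit rule of this kind can be written down as a procedure. The sketch below assumes one hypothetical rule: an act of a given category opens a new episode only if more than a fixed gap has passed since the previous act of that category. The thirty-second gap, the record format, and all names are assumptions made for illustration, not values taken from the studies cited.

```python
def tally_episodes(acts, max_gap_seconds=30):
    """Tally episodes per category from (seconds_from_start, category) pairs.

    An act joins the current episode of its category when it follows the
    previous act of that category by no more than `max_gap_seconds`;
    otherwise it receives a new tally.
    """
    tallies = {}
    last_seen = {}
    for time, category in sorted(acts):
        previous = last_seen.get(category)
        if previous is None or time - previous > max_gap_seconds:
            tallies[category] = tallies.get(category, 0) + 1
        last_seen[category] = time
    return tallies

# Hypothetical record.
record = [
    (10, "cooperation"),      # opens the door for another child
    (25, "cooperation"),      # helps carry a load 15 seconds later: same episode
    (300, "cooperation"),     # a separate occurrence much later
    (120, "aggression"),      # Jack punches Jill's nose ...
    (120, "social contact"),  # ... which is also entered as a social contact
]

print(tally_episodes(record))
```

The same event is entered under two headings simply by recording it twice, once per heading, which keeps the double tallying visible in the record itself.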

The choice between predetermined categories and the “running account” 
type of record has been made, in part at least, on the basis of the investi- 
gator’s purpose. The former procedure has been found expedient when the 
intention is to obtain a quantitative survey of the frequencies of certain 
clearly definable and psychologically distinguishable forms of behavior. 
The disadvantage, of course, is that once the units have been decided upon, 
the observer is not free to adapt his account to what he observes; rather,
he must fit what he observes to his categories and this sometimes may mean 
that he is compelled to project his own definitions on the behavior that he 
sees. For varying viewpoints on this subject, see (151, 154, 157, 165, 177, 
185). In the running account, the worker, in devising his final scheme for 
treating the data, can capitalize on the contents of his records as well as 
upon incidental learnings during the course of his observations. Both pro- 
cedures involve the danger of shifts of emphasis during the course of the 
study, and neither is proof against the development of bias on the observer's 
part during the study (149, 157). 


Time Units 

The time unit within the observation period is especially important if 
it serves as a scoring unit, as shown below. Observers have used time units 
ranging from five seconds to several minutes. The necessities for refined 
time units will vary with the nature of the behavior that is being studied. 
A gross time unit may not be sufficiently discriminative, and on the other
hand a highly refined unit may not only be unnecessary but meaningless, 
especially in the study of behavior that occurs relatively infrequently 
(166, 189), and it may interfere with accurate observing (151). Even 
when time units are not used as a basis of scoring, investigators have found 
it useful to have a record of the passage of time, say in units of thirty 
seconds, or one or more minutes; such timing is handy for orientation 
when comparisons are made between records of independent observers, for 
sectional treatment of the data in measurements of reliability, and for pos- 
sible supplementary analyses in the final treatment of the data. 


Quantitative Units Used in Scoring 


Scores may represent a tally of the frequency of a given response or 
defined unit of behavior during the entire time devoted to observation, for 
example, the number of different times a child recites (171, 178) or the 
number of different times a given language construction is used (183). 
Or they may represent the number of separate time units during which a 
given response has occurred; for instance, in a record which notes the 
passage of time in units of five, ten, thirty, or more seconds, a tally is 
given for each unit during part or all of which a child laughs, or is phys- 
ically active (149, 177, 187). For some purposes it is important to take 
account not only of the frequency of discrete responses but also of their 
extent or duration (for example, to distinguish between a lengthy recita- 
tion and a brief “yes” or “no”). 
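
The three kinds of score just described (frequency of occurrences, number of time units in which the behavior appeared, and duration) can give quite different pictures of the same record. The following sketch compares them; the record format of (start, end) times in whole seconds, the thirty-second unit, and the sample data are hypothetical.

```python
def frequency_score(occurrences):
    """Number of separate occurrences of the behavior."""
    return len(occurrences)

def time_unit_score(occurrences, unit_seconds):
    """Number of time units during part or all of which the behavior occurred."""
    occupied = set()
    for start, end in occurrences:
        first = start // unit_seconds
        # An occurrence ending exactly on a boundary does not touch the next unit.
        last = (max(end, start + 1) - 1) // unit_seconds
        occupied.update(range(first, last + 1))
    return len(occupied)

def duration_score(occurrences):
    """Total seconds occupied by the behavior."""
    return sum(end - start for start, end in occurrences)

# Hypothetical record of one child's laughter in a five-minute period:
# two brief bursts, one long spell, one brief burst near the end.
laughter = [(5, 8), (12, 15), (40, 100), (250, 255)]

print(frequency_score(laughter))      # separate occurrences
print(time_unit_score(laughter, 30))  # thirty-second units touched
print(duration_score(laughter))       # seconds in all
```

Here the two brief bursts fall within one time unit and count once, while the single long spell covers three units, so the time-unit score weights extent as well as frequency.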

In connection with some forms of behavior, it has been noted that gross 
frequencies alone may fail to tell the whole story. Further, it may be appro- 
priate, in interpreting the data, to take account of occurrences that may be 
revealing, even if rare. Likewise, relative frequencies may be as important 
as absolute frequencies, for example, A exhibits 30 “social contacts” and
10 “conflicts,” while B exhibits 60 “social contacts” and 15 “conflicts”;
the difference between the two in number of “conflicts” may be much less
significant than the difference in ratio of friendly to belligerent social
contacts (177).
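
The arithmetic of this example can be worked out as follows. The sketch assumes, since the text does not say, that the “conflicts” are counted among the “social contacts,” so that friendly contacts are the contacts remaining after conflicts are subtracted; the function names are hypothetical.

```python
from fractions import Fraction

def conflict_share(contacts, conflicts):
    """Conflicts as a fraction of all social contacts."""
    return Fraction(conflicts, contacts)

def friendly_to_conflict_ratio(contacts, conflicts):
    """Friendly contacts per conflict, assuming conflicts are a subset of contacts."""
    return Fraction(contacts - conflicts, conflicts)

# Figures from the example in the text: (social contacts, conflicts).
children = {"A": (30, 10), "B": (60, 15)}

for name, (contacts, conflicts) in children.items():
    print(name, conflicts,
          conflict_share(contacts, conflicts),
          friendly_to_conflict_ratio(contacts, conflicts))
```

Under this assumption B has more conflicts in absolute number (15 against 10) but a smaller share of conflicts among his contacts (one in four against one in three).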


Length of Separate Observation Periods 


Single observation periods have ranged from less than a minute to sev- 
eral hours, depending on the predilection of the investigator, the nature of 
the phenomena that are being observed, the convenience of all concerned, 
or practical restrictions and routines (such as length of the class period). 
In some situations the periods necessarily will be of variable length—time 
spent by observed child in taking a mental test, in dental chair during 
treatment, in taking his afternoon nap, or time taken by different children 
to exhibit a given number of responses when the sampling desired repre- 
sents the number of responses rather than a constant time interval. In the 
study of the behavior of young children, especially in “free” situations, a 
series of many short, rotated periods (ranging from five to fifteen minutes) 
may insure a representative sampling better than a smaller number of 
longer periods. The optimum interval is likely to vary with different types 
of performance (166). However, a short look-and-run observation may 
come upon a sequence of activities that already is under way, forsake it 
before it is concluded, and thus miss important features. Decisions here, 
as in connection with other features, will usually represent a compromise 
between the ideal and the practical. In planning the length of the observa- 
tion period it is proper, among other things, to provide as far as possible 
for breaks or rest periods, since the job of observing often is quite ex- 
hausting. 
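
A rotation of short periods among the children of a group can be laid out in advance. The sketch below is one simple way of doing so and is not drawn from any study cited; the names of the children, the five-minute period, and the fifteen-minute total are hypothetical.

```python
from itertools import cycle, islice

def rotated_schedule(children, period_minutes, total_minutes_per_child):
    """Return (starting minute, child) pairs, observing each child in turn.

    Every child receives the same number of short periods, spread through
    the session instead of being taken in one block.
    """
    periods_per_child = total_minutes_per_child // period_minutes
    turns = islice(cycle(children), periods_per_child * len(children))
    return [(i * period_minutes, child) for i, child in enumerate(turns)]

# Hypothetical plan: three children, five-minute periods, fifteen minutes each.
schedule = rotated_schedule(["Ann", "Ben", "Cal"], 5, 15)
for start, child in schedule:
    print(f"minute {start:3d}: observe {child}")
```

In practice the order of rotation would probably be varied from day to day so that no child is always observed first; that refinement is omitted here.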


Measurements of the Reliability of the Observer 


To meet the requirement that for each feature that is treated quantitatively 
in the statement of results there must be a measurement of the observer’s 
accuracy or fidelity in noting and recording this feature, a common proced- 
ure has been to compare, item by item, the records obtained when two or 
more independent workers simultaneously but independently observe and 
record the same behavior. The agreement can be computed in terms of percents.³
Where feasible, the observer can similarly be checked against a
mechanical recording of samplings of the behavior that is being studied 
(150, 153, 198). When predetermined categories are used, measurements of 
3 Two formulas have been used: (a) items on which observers agree divided by this value plus
items of disagreement (including items dissimilarly recorded and items noted by one observer and omitted 
by the other); (b) items in each observer's record that agree with the other's (in effect, doubling the 
agreements) divided by this total plus disagreements. For a discussion of the reasons underlying the latter, 
see Arrington (149). For illustrative findings regarding observer reliability see (149, 157, 167, 170, 175, 187, 
191, 193). Within a given study, covering several units of behavior, observer agreement on the different units
may vary considerably. The lower the agreement, obviously, the less confidently can the investigator report
quantitative findings for the behavior in question.
observer agreement and agreement on classification are, in part, telescoped
(151).
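
The two ways of computing percent agreement given in the footnote above can be written out as follows. This is a sketch under stated assumptions: each observer's record is taken to be a mapping from a time unit to the code entered for it, and an item recorded dissimilarly by the two observers is counted as one disagreement. The record format, the sample data, and the function names are hypothetical.

```python
def count_agreements(record_a, record_b):
    """Return (agreements, disagreements) for two observers' records.

    Disagreements include items recorded dissimilarly and items noted by
    one observer and omitted by the other.
    """
    keys = set(record_a) | set(record_b)
    agreements = sum(
        1 for k in keys
        if k in record_a and k in record_b and record_a[k] == record_b[k]
    )
    return agreements, len(keys) - agreements

def agreement_a(agreements, disagreements):
    """Formula (a): agreements / (agreements + disagreements), as a percent."""
    return 100.0 * agreements / (agreements + disagreements)

def agreement_b(agreements, disagreements):
    """Formula (b): agreements counted once in each observer's record."""
    doubled = 2 * agreements
    return 100.0 * doubled / (doubled + disagreements)

# Hypothetical records: time unit -> code entered by each observer.
observer_a = {1: "laugh", 2: "talk", 3: "talk", 4: "run", 6: "laugh"}
observer_b = {1: "laugh", 2: "talk", 3: "shout", 4: "run", 5: "talk"}

agree, disagree = count_agreements(observer_a, observer_b)
print(agree, disagree)
print(agreement_a(agree, disagree))
print(agreement_b(agree, disagree))
```

Formula (b) always yields a value at least as high as formula (a) for the same records, so a reported percent is interpretable only when the formula used is stated.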

The measurement of observer reliability is somewhat simpler when
observations are recorded in terms of a predetermined list of categories
than when a “running account” is made. However, even when the latter
scheme is used, there must be an understanding, before the observer begins
to take records that represent the main data of the study, as to the points of
emphasis in the observations. Item by item comparisons can be made of the
specific content of the records of independent observers, even though the
investigator may not yet have reached a decision as to how given items
of behavior, reliably recorded as far as can be determined by measurement
of agreement between independent observers, are going to be classified
and interpreted in the final treatment of the data (171).* The final
scheme of classification can, in turn, itself be tested for objectivity.

The correlation method has also been applied to measure agreement be- 
tween the records obtained by independent workers, in simultaneous or non- 
simultaneous observations. For certain purposes this procedure suffices, 
although it may not give a precise measure of agreement; for example, 
in an extreme case observers A and B both record five items for Jack and 
ten for Jill, while a perfect reproduction would yield respective scores of 
ten and twenty; A may have noted items entirely missed by B, and B noted 
items entirely missed by A; thus, there may be a perfect correlation between 
gross scores but zero agreement on actual items. 
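
The extreme case described above can be reproduced directly. In the sketch below the items that a perfect record would contain are simply numbered, ten for Jack and twenty for Jill, and each observer is assumed to have noted a different half; the numbering and the function names are hypothetical. With only two children the correlation is necessarily perfect whenever the scores rise together, so the example serves only to show the contrast between the two measures.

```python
from math import sqrt

def pearson(xs, ys):
    """Product-moment correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return sxy / sqrt(sxx * syy)

def item_agreement_percent(items_a, items_b):
    """Items noted by both observers as a percent of all items noted by either."""
    noted_by_either = items_a | items_b
    if not noted_by_either:
        return 0.0
    return 100.0 * len(items_a & items_b) / len(noted_by_either)

# Observer A and observer B each note half of the items, with no overlap.
jack_a, jack_b = set(range(0, 5)), set(range(5, 10))
jill_a, jill_b = set(range(0, 10)), set(range(10, 20))

scores_a = [len(jack_a), len(jill_a)]  # gross scores from A: 5 and 10
scores_b = [len(jack_b), len(jill_b)]  # gross scores from B: 5 and 10

print(pearson(scores_a, scores_b))
print(item_agreement_percent(jack_a, jack_b))
print(item_agreement_percent(jill_a, jill_b))
```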


Size and Reliability of the Sampling 


The extent to which the data represent a reliable sampling has custom- 
arily been measured by means of correlating equal divisions of the 
data (for example, scores on even versus odd observation periods, or 
on “direct” versus “indirect” records, and the like). For rough purposes, 
especially when observations are made of groups rather than individuals, 
it is possible also to measure the consistency and extent of differences 
between group means in successive samples. 
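
The correlation of scores on odd versus even observation periods can be sketched as follows. The children, their scores, and the function names are hypothetical, and the sketch stops at the correlation between the two halves without applying any further correction.

```python
from math import sqrt

def pearson(xs, ys):
    """Product-moment correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return sxy / sqrt(sxx * syy)

def odd_even_totals(period_scores):
    """Total score on the 1st, 3rd, 5th ... periods and on the 2nd, 4th, 6th ..."""
    return sum(period_scores[0::2]), sum(period_scores[1::2])

# Hypothetical tallies of one behavior in six observation periods per child.
scores = {
    "child 1": [2, 3, 2, 3, 2, 3],
    "child 2": [5, 4, 6, 5, 5, 6],
    "child 3": [0, 1, 1, 0, 0, 1],
    "child 4": [8, 7, 9, 8, 7, 9],
}

halves = [odd_even_totals(periods) for periods in scores.values()]
odd_totals = [odd for odd, _ in halves]
even_totals = [even for _, even in halves]

print(odd_totals, even_totals)
print(round(pearson(odd_totals, even_totals), 3))
```

Because the computation needs only the period-by-period tallies, it can be repeated at interim stages as further periods are added to the record.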

By reason of the ease of scoring when predetermined categories are 
used, such measurements can often quite expeditiously be made during 
interim stages to determine the reliability of the sampling to date. Even 
if the classification scheme is not determined in advance, it has been found 
to be feasible to make such interim measurements in terms of tentative 
categories. If not feasible, the investigator will not necessarily be com- 
pletely in the dark. He can be guided to some extent by other studies, by 
rough interim inspections, and he can veer in the direction of taking a 

*E. g., both observers record that while John, during class discussion, is describing the route of an
airplane flight to Europe, Jim interjects, “Didn't they pass over Newfoundland?” There is measurable
agreement here, even though the investigator may not yet have determined under which one of several
categories (such as “critical comment”; “interrupts”; “irrelevant remark”; “volunteers
information or question”) Jim’s response will be tallied, in the light of its context and what has gone
before, in the final tabulation of the data.
larger sampling than may be needed; if in the end it proves that he has
reliable data without using a given portion of his records, he can lay such 
records aside. In any event, investigations that have been made to date 
indicate that some features in the study are likely to be more reliable than 
others. Accordingly, he can circumscribe his final quantitative treatment 
in terms of his reliability coefficients and still use the remaining findings 
as an aid in interpretation, or announce them as tentative. 

At the present time it is impossible to give a definite estimate as to 
approximately how much time the observer should spend in a given situa- 
tion with a given problem to obtain a reliable sampling. Numerous studies 
of the social and physical activities of preschool children have utilized 
a total of from 100 to 150 minutes, in repeated samplings of from five to 
fifteen minutes. This amount cannot, however, be recommended as standard. 
Published studies show wide ranges of reliability coefficients on the basis 
of from about forty minutes to many hours (or even days) of observation 
of different features of behavior. Within a given study the reliability co- 
efficients are likely to be higher for some categories or units of behavior 
than for others (149, 175, 187). Likewise, a given sampling within the same 
general situation may be more reliable at one age level than at another. 
The question as to the minimum acceptable reliability coefficients for 
scientific use of observational data has not been treated systematically. Need- 
less to say, high reliabilities are desirable when the data deal with phe- 
nomena that exhibit individual differences (as most things do). Many 
published observational data do have high reliabilities, although investi- 
gators have frequently been content with lower reliabilities than custom- 
arily are demanded in connection with standardized tests or laboratory 
measurements. Under many circumstances a well-executed observational 
study with ostensibly low reliability may possibly give more authentic 
results than does a standardized test that is more reliable in the usual 
statistical sense, but is less realistically adapted to the phenomena which 
it purports to appraise. The doughty observer, however, will not bank on 
this smug generalization. 

Generally speaking, reliability is likely to rise with an increase in the 
size of the sampling, but there are limits to which the investigator can take 
advantage of this tendency, as when he is studying behavior under condi- 
tions that by their very nature are not constant, for example children’s 
initial adjustments in a new situation, or behavior that normally changes 
with age but might change at different rates in different children. 

Since the behavior of individuals may vary considerably in different 
situations, an adequate study of the behavior in question may require 
observation of the same individuals “under such a diversity of circum- 
stances as to constitute a representative sampling of the child’s daily life” 
(166). Failing this, the investigator obviously can indicate that indi- 
vidual scores hold true only within a limited set of conditions. 
Position of the Observer


The observer obviously must try to maintain a position that enables 
him to see or hear the items of behavior which he purports to measure: 
if there are restrictions on his mobility, where it is necessary to move 
about, or if a required stationary position renders some objects or types 
of behavior less accessible than others, he perforce must revise the objec- 
tives or scope of his study accordingly. One vantage point may be more 
favorable than another (187). This factor may complicate the meas- 
urement of observer reliability. If the observer desires only a rough measure 
of certain characteristics or outstanding episodes, he can station himself 
at a strategic point and note such happenings as flow before him, without 
systematic attention to each member of the group. 


Effect of the Observer 


The presence of an observer might be expected to produce self-conscious- 
ness or other reactions that would distort the behavior which is being 
studied, but this factor has usually been found to be less serious than might 
be anticipated. Observers repeatedly report that much-observed children, 
as well as adults, seem quite readily to become habituated to the presence 
of an outsider. The observer normally is careful not to participate in the 
activities he is observing unless in so doing he deliberately is introducing 
a factor pertinent to his study. Instances have, however, been reported in 
which children’s behavior seems to have been influenced, at least for a 
time, by the observer’s presence (178), and a teacher or parent whose 
practices are being observed would be less than human if he were not 
somewhat affected, although the passage of time here also has a tranquiliz- 
ing effect. Whatever the observer’s effect may be, it is not likely to be so 
pronounced in the long run that the records fail to show individual dif- 
ferences. As time passes habitual practices and interactions between in- 
dividuals in the group come to the fore, and even the first of a series of 
records may reveal large individual differences in types of behavior that 
the observed individual might especially desire to display or to conceal, 
which prove in later observations to be characteristic. One-way vision 
screens or windows may be used for some purposes (147, 168, 172), but 
in many circumstances it would not be feasible to conceal the observer. 
The eavesdropping technic can be duplicated to some degree in “free” 
situations when the observer or auditor records what transpires while 
ostensibly he is uninterested or preoccupied with other matters (158, 181). 


Scope of Observations 


In a majority of published studies the observer has followed the be- 
havior of one child at a time, but under properly defined conditions he 
may observe an entire group at a time. In observations of a group, data
may be obtained which show relatively high reliability in indicating group
trends—in gross comparisons, for instance, between different classes
studied with respect to certain performances (205)—while failing to yield
reliable data for individual children (174). On the other hand, under 
proper restrictions, observations devoted to an entire group may yield 
reliable data for individual subjects; for example, in a study of the fre- 
quency and quality of pupil contributions and of the interactions between 
pupils and teacher, the observer may definitely center his attention only 
upon certain definable items that flow before the class as a whole when 
the group is working on a common project (178). 

It might seem that individual observations require an undue amount 
of time in return for the results obtained. Future studies will no doubt 
show ways in which the advantages supposedly inhering in an observa- 
tional record might be approximated by means of short-cuts. At the pres- 
ent time, however, the use of other methods in certain areas of behavior 
would not mean that the investigator is covering the same ground more 
quickly; it would mean simply that he was obtaining some facts and for- 
feiting others. It does not follow, of course, that the returns from observa- 
tions have always been commensurate with the time and labor devoted to 
them. 

In a well-planned study, data actually are accumulating faster than 
would appear. While observing one child at a time the observer also is 
obtaining an added accumulation of “indirect” data on other children if 
his record includes, as it often should, an account of the names and be- 
havior of other subjects with whom the individual comes into contact or 
who influence his behavior in observable ways. More important is the fact 
that an active observer can accumulate an enormous amount of material 
in a relatively short time; for purposes of intensive study the data ac- 
cumulate just as rapidly by means of direct observation as by practically 
any other means. Indeed, if the investigator is trying to get at the under- 
lying elements of the behavior processes in question the time spent in ob- 
taining the original records may represent only a small fraction of the 
amount of time that subsequently must be spent in analysis of the records. 
If the investigator is interested in only a limited aspect of the behavior 
which is recorded, the same data (if the original observations were care- 
fully executed) may be used by himself or others for other types of treat- 
ment. Observational records of children’s language (159) and the setting 
in which the language was used have been utilized by independent work- 
ers in various studies of language, as such, as well as in studies of re- 
sistance, aggression, imaginative behavior, and sympathy. Again, a series 
of “diary records” supplied data for a study of children’s conflicts, and 
for another study of their make-believe activities; and the same raw mate- 
rials might have been used for other purposes (175). 

The direct observation technics abound in opportunities for cooperative 
research. Two persons may observe simultaneously, each concentrating on 
different aspects of behavior, and subsequently pool their data (149). In
a majority of studies the investigator and his associates participate directly 
in the gathering of the data, but data that might otherwise be inaccessible 
have also been gathered through the help of cooperating teachers and 
parents (164, 173); naturally, the more the job of observing and recording
is delegated to others, the less rigid can be the control of the observation
and the less can be known as to the observer’s fidelity and the
reliability of the sampling. 


Possible Future Uses and Adaptations 


Investigations to date have made only a small start in exploring the uses 
and adaptations of methods of direct observation. The writer does not 
pretend to be a prophet, but his guess is that among other developments 
which the future will bring are the following: considerably more use of 
the method in connection with the planning and appraisal of the curriculum, 
including, among other matters, the use of a series of observations in class- 
rooms to provide leads as well as specific content in the construction of 
tests for the measurement of children’s learning, concept and attitude formation,
and other outcomes of given educational projects or regimes; more 
use of the method in connection with problems of the rearing of children 
at home (with the possible results, among other things, that there will be 
a better differentiation between problems that are a normal and perhaps 
salutary feature of development as distinguished from problems that require
help); more use of the method in the study of theories and practices
in mental hygiene and psychiatry, on both the diagnostic and therapeutic
sides (with an eye to the possibility of more attention to overt action, 
and with the possible result that a more systematic and objective terminology 
may be evolved); more information, as investigations accumulate, as to 
how authentic data might be obtained through less time-consuming methods 
than now are required, including possibly the better utilization of partially 
controlled situations, of supplementary test and interview technics, and 
the use of procedures that precipitate the behavior process that is being 
studied, such as the “projective” method, the “play technic,” and the like; 
more information as to how units or patterns of behavior may be defined 
in terms of “psychological” units or patterns; and the possible discovery, 
in given areas, of limited features of behaviors that may serve as an index 
of larger aspects. 





CHAPTER VI 
The Case Method¹ 


WILLARD C. OLSON 


Problems in human relations and in institutional management are usually 
presented to professional people in case form, and the published litera- 
ture reflects the search for practical answers rather than scientific generali- 
zations. The latter, however, may develop naturally out of case materials, 
as will be explained later. A case study may involve the use of any of the 
special methods discussed in the present issue of the Review. The unique 
features of the case method are the synthesis of many types of data, interpretation
in accordance with known principles as modified by the 
dynamics of interrelationships, and generalizations in terms of wholes 
rather than parts. 

The present summary includes reports during the period 1930-39 which 
illustrate contributions of case method to the advancement of technical 
knowledge and excludes those with the immediate objective of practical 
help in a situation. In the three decades prior to the opening of the period 
of this review, the case method accounted for but 2 to 3 percent of articles 
and studies in education (229, 233). There was frequently a tendency to 
regard it as a temporary expedient, an exploratory device, or as a means 
of organizing data collected by other methods (211). No comparable summary
of the use of case methods for the last decade has been located. Greater 
attention to personal and social development has given the case study 
of the individual a new dignity in education. 

The unique importance of the method has been well stated in the following
quotation from Allport (208: 390): 

The method . . . is the most comprehensive of all, and lies closest to the initial 
starting point of common sense. It provides a framework within which the psychologist 
can place all his observations gathered by other methods; it is his final affirmation 
of the individuality and uniqueness of every personality. . . . Unskilfully used, it 
becomes a meaningless chronology, or a confusion of fact and fiction, of guesswork and 
misinterpretation. Properly used it is the most revealing method of all. 


In presenting the material the writer has devoted a brief section to gen- 
eral descriptions of the case method and has grouped the remaining studies 
in terms of the services performed by the method in the establishment of 
professional practices and scientific generalizations. 


General Descriptions and Evaluations of Case Methods 


Textbooks in educational research usually devote a section to the case 
method. Good, Barr, and Scates (234) gave a fairly extensive bibliography 
for the beginning student of educational research. Culver (220) included 
fifty references on case method in her bibliography of 1,509 items on 
methodology in the social sciences. The field of personnel relies heavily 
on case technics (268). The recent comprehensive outline by Preu (251) 
was intended primarily for psychiatrists but utilized the data of other 
specialists. The advocacy of comprehensive cumulative records, amenable 
to case study for all children in schools, is becoming increasingly prevalent
(235, 265, 271). 

¹Bibliography for this chapter begins on page 599. 

Interviewing is usually but one of the means of securing data for case 
studies. In psychoanalysis and the Rorschach method, however, the 
results of the specialized interview have tended also to constitute the case 
study. Psychoanalytic technic stemming from the late Sigmund Freud 
has, of course, occupied a most dramatic position in case method as applied 
to problems of human living. Currently, the Rorschach, essentially an 
individual method for eliciting responses to stimulus figures, is receiving 
intensive use and scrutiny. The technic has been both denounced and 
praised for its subjectivity. The supporters stress its use as an art, and the 
desire for trained interpretation and standardization has led to the 
creation of a Rorschach Institute. The reader to whom this technic is 
novel may wish to examine Beck’s monograph (213). Independent 
analyses of the responses of a given subject by several persons serve to 
describe the reliability of the technic (237). Frank (228) recently discussed 
the unique contribution of projective methods with an extensive bibli- 
ography illustrating the wide variety of situations which may be used for 
eliciting personal data. On the whole, there is a trend toward eclecticism 
well illustrated in a publication of the staff of the Institute for Juvenile 
Research (239) and in most recent textbooks and manuals. 

The difficulties in securing scientific generalizations from case materials 
need not be discussed here. Olson (250) stated that “from the point of 
view of prediction and control of the growth and behavior of an individual 
the case study is the most scientific method now known. Reasonably large 
and representative samples of the population from which data are gath- 
ered in a definite manner are recognized as necessary conditions for the 
extension of generalizations to groups.” 

A. E. Wood (270) pointed out that the statistical units extracted from 
records may be criticized as having little meaning apart from context; 
that is, the significance of a factor does not lie within itself alone but in 
its relationship to component elements. The factor is thus robbed of some 
of its unique meaning when divorced from context for statistical analysis. 
In spite of this limitation, however, some statistical analyses of importance 
have been made as will be seen subsequently. 

Dollard (223) made a critical examination of the life history as a sci- 
entific method. In order to use the life history for the establishment of 
generalizations, he stressed the fact that the subject must be viewed as a 
specimen in a cultural series, that biological factors should be socially 
relevant, and that the role of the family in transmitting the culture and 
the continuous related character of development should be recognized. 
He evaluated a number of published life histories in terms of these criteria. 

The literature on description of case method and cases is voluminous, 
but there is relatively little available on the unique contribution of the 
method to general knowledge. Selected studies have been grouped under 
the following heads to indicate the types of contribution that have been 
made: 

Statistical Summaries for Administrative Information 

Evaluation of Programs 

Social and Institutional Patterns 

Curriculum and Instruction 

Illustration and Validation of Statistical Results 

Establishment of Scientific Generalizations. 


Statistical Summaries for Administrative Information 


When professional persons wish to communicate to others the nature of 
the problems with which they are dealing, it is a frequent practice to 
tabulate cases under significant categories. These tabulations serve to 
define policies and practices. Thus, Bassett (212) analyzed 523 cases 
brought to the clinic at Vineland during a period of five years. The cases 
were tabulated to show their origin in home, school, and community, and 
were distributed by age and mentality in relation to the incidence of in- 
corrigibility, sex delinquency, and physical defects. Rosenheim (253) 
supplied a report on the types of cases referred to the child guidance clinic 
of a state hospital. Fenton and Wallace (226) classified cases according to 
source of reference, age, sex, race, problem, intelligence, school grade, and
economic status, and presented interrelations between some of the factors. The 
function of a family agency as defined by its clients was reported by Shul- 
man (256). Kawin (241) analyzed 100 children studied by the clinic of the 
preschool department of the Illinois Institute for Juvenile Research. Her 
description also led to the generalization that about 12 percent of a 
group of unselected children seem to need special clinic study and treat- 
ment. 


Evaluation of Programs 


The evaluation of programs has frequently rested upon the subsequent 
history of persons affected by them. Thus, when Berk, Lane, and Tandy 
(214) wished to follow up thirty habit clinic children who manifested 
delinquency problems before the age of ten years, they secured parent, 
teacher, hospital, and agency judgments on improvement and noted that 
the frequency of problems was reduced by about one-half. The improve- 
ment was greatest in the case of a cooperative home and agency and in 
those children of normal or superior intelligence. Glueck and Glueck (232) 
wished to gauge the effectiveness of a juvenile court that had some clinical 
assistance. Their analysis of 1,000 delinquent boys revealed that about 
88 percent continued offenses after the period of treatment. Among 93 closed 
case records of a research and guidance bureau of a public school system, 
Bodin (217) found that about 93 percent of boys and girls who had 
reached the ages of twenty-one and eighteen years, respectively, had be- 
come delinquent or criminal. 

Lowenstein and Svendsen (244) evaluated the results of eight weeks of 
farm-camp treatment for thirteen girls and boys characterized as shy or 
withdrawn by a follow-up study among case workers, parents, and foster 
parents. Kelly (242) summarized the results with the first 800 cases 
admitted during two years at a psychological clinic, relating the degree 
of success to the specific recommendations made. The treatment results of 
four guidance clinics were summarized and compared by Witmer (269). 
The effectiveness of provisions for the mentally retarded has been studied 
by case records of subsequent histories (210). 

Most follow-up studies utilize some feature of case methods, since the 
populations can no longer be found in convenient statistical groupings. 


Social and Institutional Patterns 


The study of group patterns existing in families, classes, schools, and 
communities is of large immediate and potential interest to students of 
educational research. Many studies of this type are included in a new 
journal Sociometry (259) and the reviewer will not attempt separate 
citations. An application has recently been made and reported at length 
in connection with a nursery school setting (246). With the tendency to 
define the task of the teacher as primarily lying in the field of interpersonal
relations, this type of inquiry should be fruitful. 


Curriculum and Instruction 


Most writers stress the point of view that the unique feature of case 
methods is the interpretation of a variety of data collected by many 
technics. While interpretation rests frequently on bodies of information 
obtained by other than case methods, the unique aspects of synthesis are 
commonly taught and the interpretations of the sophisticated person are 
the only ones usually credited with validity. 

Sperle (260) reviewed the place of the case method in the curriculum of 
law, medicine, sociology, and psychology as an introduction to a de- 
scription of its use in the preparation of teachers in the New Jersey 
State Teachers College. The technic of collecting case materials and evalu- 
ations by students and faculty are included in her report. 

In a textbook describing remedial and corrective instruction in reading, 
McCallister (245) devoted several chapters to the presentation of type 
cases to advance the skill of students in dealing with problems in this 
field. For instruction in the general field of educational guidance, Smithies 
(258) earlier compiled a book of case studies of normal adolescent girls. 
Wallin (266) contributed a case book of classified autobiographies for 
instructional purposes. 

Child Guidance Cases, edited by Sayles (254), presented cases for 
teaching purposes, illustrating the combined physical, psychological, social, 
and psychiatric approach. The material was intended primarily for the 
preliminary instruction of the psychiatric social worker. Sibley and Stodg- 
dill (257) advocated the abstracting of clinical reeducation technics as 
a method of training beginners in clinical psychology. Numerous guides 
and problem books in education adopt a case approach. Further study 
could profitably be made of the use of case materials in curriculum and 
instruction. 


Illustration and Validation of Statistical Results 


It is common practice to illustrate statistical findings by rather complete
reports on individual cases. For example, Newman, Freeman, and 
Holzinger (248) gave their statistical findings on twin resemblance added 
human interest and a qualitative matrix by a detailed presentation of his- 
tories. A similar technic was used by Terman and Cox (264) in a study 
of masculinity and femininity. Throughout a statistical and measurement 
study of failures in college, Heaton and Weedon (236) gave illustra- 
tive material gleaned from the case records of the students studied. The 
reader who is untrained in statistical and experimental study is per- 
mitted to become more intimately acquainted with the material by using 
the case method as an alternative mode of presentation. 

The items of personality, mental, and aptitude schedules have usually 
had their origin in the direct observation and study of cases. A common 
technic in the subsequent validation of psychometric methods has been 
the contrast of clinically selected cases with normal individuals. Con- 
trasts of this type are too numerous and readily available to need separate 
citation and are properly discussed under other technics. A few illustra- 
tions of the role of case studies in relation to measurement will suffice. 

Fitz-Simons (227) compiled items relating to parent-child relationships 
from clinical and other sources and produced a checklist on the basis of 
evaluation by a jury of clinicians. The device was then reapplied to case 
material for the delineation of parent-child relationships with substantial 
agreement among independent appraisals. 

Studies by Hill (238) and Baker and Traphagen (209) reported success 
in weighting items of case histories to produce quantitative scores which 
differentiate between problem and nonproblem groups. Weisman (267) 
studied the validity of predictions based on mental tests by treating each 
of thirty students in a high school as a separate case and following the 
high-school record after testing. The case approach in this study threw into 
relief the factors affecting performance in an individual manner which 
would be effectively concealed by a coefficient of correlation between intelligence
and achievement. In the field of personality study, Rogers (252) 
was able to demonstrate that his test for personal adjustment correlated 
significantly with the appraisals of case material by groups of clinicians. 
Furfey (230) used sixty-four brief case studies grouped by age levels from 
six to sixteen to illustrate and validate the concept of developmental age 
as measured by an interest questionnaire. 


Scientific Generalizations 


In medical practice the accumulation of published reports of cases of 
particular types has resulted in a body of knowledge leading to ex- 
panding stages of generalization. Something similar is probably happening 
in certain of the social, psychological, and educational fields. 

Perhaps one of the best examples is the series of publications by Shaw 
and his associates (255) in the investigation of delinquency in Chicago. 
His recent report on “Brothers in Crime” continued the case methods of 
“The Jack Roller” and “The Natural History of a Delinquent Career” in 
a report of a fifteen-year study of five brothers with rather complete 
official, clinical, and autobiographical records. The report stressed the 
process by which the delinquent careers of the brothers had their origin 
and development and the limitations of usual methods of treatment in de- 
flecting the course of a delinquent career. This series of studies leads to 
cumulative generalizations concerning the relative role of personality and 
the culture. Stressing more largely the personality approach, the biographies
of child development by Gesell (231) illustrated what can be 
accomplished by a clinical case method following children through time. 
The publication of longitudinal individual data from the Harvard Growth 
Study (221) and current analyses from the Child Development Labora- 
tories of the University of Michigan (249) and other centers indicated 
further possibilities in the analysis of material in terms of total indi- 
vidual pictures. 

Prior to the period of this review, Terman and his associates had 
made extensive use of biographical case material and records in the in- 
vestigation of genius. The last volume of the three, Promise of Youth, again 
utilized case material as a fundamental method (263). 

Bühler (218) compiled biographies, diaries, correspondence, and records
of achievement of outstanding American and European persons of 
the past two centuries. With this material as a basis she classified the life 
periods into describable epochs of growth and decline in biological and 
psychological functions. 

By machine analysis of an accumulation of clinical cases, Ackerson 
(206) was able to make an extensive description of the stream of human 
material passing through the Institute for Juvenile Research. In subsequent 
studies he has developed technics of cluster analysis illustrating grouping 
of symptoms for significant clinical entities. Thus, with Jenkins (240) he 
took fifty-seven cases with a diagnosis of encephalitis from a group of 
5,000 and correlated 200 descriptive items with the diagnosis. This technic 
yielded change of personality as one of the most frequent and highly 
correlated symptoms, with listlessness, quarrelsomeness, irritability, and 
other characteristics in decreasing order. Similar correlations were worked 
out for physical conditions, school records, home conditions, sex prob- 
lems, and eating habits. 

The recent work of Murray and his associates (247) at Harvard is 
promising for important generalizations from the synthesis of data from 
many-sided case studies of college men. 

In determining the relationship between crime and psychopathy, Erick- 
son (224) examined the records of 1,262 male patients in mental hos- 
pitals and found that 25 percent had a history of criminality—10 percent 
before onset of mental disorder and 12 percent afterward. 

Symonds (262) obtained thirty-two pairs of case studies from former 
students. Each pair dealt with one child that had been accepted by the 
parents and one child that had been rejected. The generalization is reached 
that the accepted child has every chance to develop into a well-balanced, 
emotionally stable adult, while the rejected child is destined, on the aver- 
age, to show strong aggressive traits, to be hostile and antagonistic toward 
those with whom he must have dealings, and to develop tendencies which 
may lead to delinquency. 

Albright and Gambrell (207) found in the personality traits of case 
studies of adolescents significant material for the prognosis of the success 
of psychiatric treatment. Seeking an answer as to why children discontinue 
child guidance treatment, Feldman (225) analyzed and discussed such 
factors as age, source of referral to the agency, patient’s personality traits, 
patient’s attitude toward treatment, parental attitude toward treatment, 
parental personality traits, and reasons for children discontinuing treat- 
ment as seen by the case workers. 

The more unique an event the greater the demand for a case approach. 
Thus, research on the Dionne quintuplets is proceeding by individual study 
of each of the girls as well as by what might be termed a case study of the 
setting for the group as a whole (215, 216). The data obtained at the 
birth of these children were highly significant from the biological point 
of view. The growth data and careful observation of the effects of train- 
ing present a rather crucial test of the extent to which the environment 
through time will alter the fundamental pattern or status of the five 
children. 

The pages of the current number of the Review could easily be filled by 
a discussion of the numerous studies reporting interrelationship and 
trends in the field of clinical study. 

In closing it should be noted that the case method is practically mandatory
for the student interested in process rather than product. A test will 
measure the number of arithmetic problems solved correctly or incorrectly,
but it requires direct observation by the investigator and verbalization
on the part of the child to note the steps that lead to incorrect solutions
(219). It is one thing to measure information in American history 
and another to describe the nature of the difficulties encountered in securing
comprehension (222). Schedules will measure extroversion-introversion,
ascendance-submission, and other personality qualities, but observation
of cases in settings is needed to delineate the process of personal-social
interactions (243). 


Summary 


The review of case methods indicates clearly that practical necessity has 
led to their development, since many of the problems of human living are 
presented in case form. The best study of cases requires mastery of prac- 
tically every technic known for the study of the individual and society. 
Reliable and valid work is exacting in its requirements for training and 
insight. Case methods present a severe test of the maturity of the art and 
science of human relations. The contribution of the case approach to ad- 
ministration, evaluation, analysis of social and institutional patterns, 
curriculum and instruction, the illustration and validation of statistical 
approaches, and the establishment of scientific generalizations indicates 
substantial progress in the past decade. 


CHAPTER VII 
Genetic Method¹ 


KAI JENSEN 


Introduction 


The genetic method is used in long-time investigations to discover and 
study the origin, trend, rate, direction, and pattern of development in the 
phenomena studied. It is often thought of as pertaining primarily to phys- 
ical and anatomical growth but actually it is being used increasingly for 
the study of mental, social, and personality characteristics as well. The 
genetic method cannot be separated with precision from the historical or 
case history method, but it differs from the historical method in that it is 
more concerned with current developmental sequences, and it differs 
from the case history method in that it studies normal as well as atypical 
phenomena and it is not so likely to fail to consider the negative instance. 
As a problem-solving mode of attack it makes use of many other methods, 
such as the normative-survey, the experiment, the causal-comparative, the 
observational, the rating, or any other which will help it attain its ultimate 
objective of the prediction and control of individual development. 

The longitudinal or seriatim procedure is often identified with the 
genetic method and contrasted with the cross-sectional approach. Actually, 
the genetic method encompasses both. Developmental sequences may be 
secured by either method although it is obvious that the procedure which 
utilizes successive measurements on the same individuals may give a more 
accurate picture of developmental sequences than does the cross-sectional 
attack. The essence of the genetic method is that it does not study any 
given cross section of growth or development, or level of behavior, for its 
own sake but seeks to discover developmental principles represented by a 
particular cross section in relation to other cross sections. Critical de- 
scriptions and evaluations of the genetic method will be found in Good, 
Barr, and Scates (310) and in Munn (350). 


Problems Attacked by the Genetic Method 


Any method is valuable and vital in proportion to the extent to which it 
yields solutions for significant problems. It is relatively weak or useless if 
it either attacks problems of little import or fails to yield needed answers. 
One field in which the genetic method has been used extensively is in the 
securing of growth and developmental norms of all sorts. At present its 
use is being greatly extended in the field of research which seeks to dis- 
cover and evaluate the factors influencing various aspects of growth and 
development. Relatively unexplored fields for which the genetic method 
is peculiarly suited are the study of experimental modifications or alterations
of behavior, and the investigation of origins and causes of behavior. 
A long list of problems urgently in need of attack by the controlled longitudinal
approach was published by the National Society for the Study of 
Education (322). 

¹Bibliography for this chapter begins on page 602. 


Normative Developmental Studies 


Many studies of the various component parts of the complex growth 
process have been completed. Some of these studies have been made in 
terms of developmental level while others have been done in terms of 
particular segments of behavior. Research dealing with developmental 
sequences in the prenatal period has been excellently reviewed and evalu- 
ated by Carmichael (287). Children after birth have been studied from 
many angles, and developmental sequences in the physical, motor, physio- 
logical, mental, and social realms have been published (279, 284, 285, 286, 
293, 298, 303, 305, 309, 330, 347, 359, 374, 396, 397). In addition, de- 
velopmental studies of thinking and reasoning (282, 289, 296, 329, 342, 
353, 360, 363, 366, 388), aggressiveness (274), sympathy (351), sociability
(327), dressing behavior (275, 333), fears (325, 328), eating 
behavior (307, 308, 319), sleeping behavior (305, 306), and language 
(283, 317, 324, 339) have also appeared in the literature. Lindsey (336, 
337) and Smith (383) have published developmental studies of the human 
electroencephalogram. 


Critical Evaluation of Normative Studies 


The above research dealing with developmental sequences has been 
carefully done and the norms secured are probably reasonably accurate 
for the culture pattern in which they were secured. These norms are of 
particular value in dealing with groups and may also be very useful in 
helping set the stage for optimum learning by helping evaluate the maturity 
level of the child (352). It must, however, be definitely recalled that these 
norms, in whatever field they may be, are affected by many factors. Thus 
cerebral birth injury (297, 356), prematurity (372, 373), restricted or 
accelerated practice (294, 295, 326, 340, 341), conditioning (318), praise 
and competition (332, 358, 398), home conditions (278, 301, 314, 315, 
349, 359, 382, 386, 387, 391), conflicts (273, 299, 313), and maturity 
levels (352), singly or in combination, may markedly alter the ascertained 
behavior levels in a particular child or in groups of children. Conse- 
quently, any norms on child growth and development, no matter what 
aspects may be involved, are not final but are the end product of a par- 
ticular set of circumstances which, if altered, may produce entirely dif- 
ferent results. Too much research in this field has been purely descriptive 
in character and concerned with the mere determination of the child’s 
present status, whereas the real problem should deal with the ascertainment 
of the specific factors which have produced the particular results at hand 
and which will modify future development. 
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igi- Genetic Studies Which Deal with Various Gross Environmental 
of Factors Which Influence Growth and Development 


Some striking studies dealing with the effect of various gross environ- 
mental factors, such as socio-economic background, nursery schools, or- 
phanages, and foster homes, on the mentality of the child have been made 


wth (280, 335, 357, 371, 377, 378, 379, 380, 381, 384, 393, 394, 395, 396). 
in These studies seem to indicate that enriched environment may produce a 
of marked change in the intelligence test score of the youngster under con- 
ital sideration and that these gains tend to persist. All the experimenters are 
ilu- not in agreement and decidedly conflicting interpretations have been of- 
om fered. Whether or not there has been a change in the central nervous 
si0- system of the child or whether the experiments show the unreliability of 
86, intelligence tests at the preschool level or accurately measure spasmodic 
de- and fluctuating growth can only be settled by the development of new ex- 
42, perimental technics and further experiments. 
One theoretical point needs to be mentioned. There is a commonly 
accepted belief that if a trait is hereditary nothing can be done about it. 
This point of view arises from a misunderstanding of the nature of ex- 
periments in genetics. In this field environmental factors are presumed 
to be held constant so that differential manifestations of the germ plasm 
may be studied. A good experimenter in this field strives to be able to 
say that heredity alone played a differentiating part in the observed results; 
he must show that environment did not produce the obtained outcomes. 
Uncritical nongeneticists and some geneticists have construed this situation 
to mean that changing environment would make no difference. Actually, 
heredity may set limits of development, but no one knows what those limits 
are except under given conditions. As the conditions are altered the mani- 
festation of heredity may be markedly changed (321). Clear realization 
of this point of view should do much to stimulate fundamental research 
in the entire field of child development. Some scattered studies, which 
need elaboration and refinement, served to show that reaction to failure 
(332), motor skills (340, 341), artistic ability (368), singing ability 
(390), and other aspects of behavior formerly thought of as more or less 
predetermined may be greatly modified by appropriate environmental 
changes. 


Prediction 


Several studies have been made of the predictive value of tests given in 
the early preschool period for test scores at later ages. Workers in this 
field seem agreed that the prediction value of these tests is low (277, 281, 
316, 354, 355, 361). This, in part, was accounted for by the fact that the 
early tests were largely motor tests. Other factors which may help account 
for the poor predictive value were the variations in growth curves of the 
children under consideration. Thus, studies such as those of Shirley (374), 
Scammon (370), Meredith (347), and Boynton (284) clearly showed that 
growth curves are not uniform or constant. This is of extreme importance, 
for most prediction of subsequent behavior as now practiced assumes that 
the growth is constant. The more carefully children are measured, the 
more clearly it develops that their growth curves may be highly irregular 
in shape. Consequently, any accurate evaluation of the child’s behavior 
must of necessity deal with his particular developmental curve or curves. 
Ascertaining the stage of development, or maturity level, at any one time, 
however, does not permit accurate predictions beyond that point for it 
gives no indication whether the child is going ahead, standing still, or even 
going backward—knowledge that is all important for prognosis. Further- 
more, even if the direction of development is known, it becomes important 
to know the rate of development because future status obviously will differ 
markedly as that rate is slow or fast. 
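
The point that a single measurement reveals neither direction nor rate may be illustrated by a minimal sketch in Python. The function names and the two sets of scores are hypothetical and are not drawn from any study cited here; the sketch merely assumes that the same child has been measured at several ages. 

```python
from typing import List, Tuple


def growth_rates(ages: List[float], scores: List[float]) -> List[Tuple[float, float]]:
    """Return (midpoint age, change per year) for each pair of
    successive measurements of one child."""
    if len(ages) != len(scores) or len(ages) < 2:
        raise ValueError("need at least two paired measurements")
    rates = []
    for i in range(1, len(ages)):
        span = ages[i] - ages[i - 1]
        if span <= 0:
            raise ValueError("ages must be strictly increasing")
        midpoint = (ages[i] + ages[i - 1]) / 2
        rates.append((midpoint, (scores[i] - scores[i - 1]) / span))
    return rates


def direction(rate: float, tolerance: float = 1e-9) -> str:
    """Label the sign of a rate in the terms used in the text."""
    if rate > tolerance:
        return "going ahead"
    if rate < -tolerance:
        return "going backward"
    return "standing still"


# Hypothetical records: both children have the same score (50) at age 4,
# yet their histories differ in direction and in rate.
child_a = growth_rates([2, 3, 4], [30, 40, 50])
child_b = growth_rates([2, 3, 4], [58, 54, 50])
print(direction(child_a[-1][1]), child_a[-1][1])
print(direction(child_b[-1][1]), child_b[-1][1])
```

The two children are indistinguishable on the last examination alone; only the serial record separates them. 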

Equally important with direction and rate of growth is the matter of 
pattern of development. As Scammon (370), Meredith (347), Boynton 
(284), Dearborn and Rothney (292), and others have made clear, the 
different parts of the body do not grow at the same rate. Likewise, the 
different aspects of the child’s total development vary in relation to one 
another. Whenever the adjustment of the child, or any other component 
of his total pattern is altered, other aspects of the child may also be altered. 
It often is impossible to measure one aspect of child development and have 
it stay put while any of the other aspects are subjected to change. The im- 
plications of the above considerations for research of the broad and inte- 
grated type, involving frequent reexamination of the same children, are 
obvious. 

One interesting development in this field is the attempt of Richards and 
Newberry (365) to predict Gesell test scores on the basis of prenatal 
behavior as reported by the mother. Preliminary results are highly sug- 
gestive but cannot be taken at their face value because of the small number 
of subjects involved. 


Methods and Technics 


Technics of research in physical growth and anthropometry were re- 
viewed by Meredith in the February 1939 issue of the Review of Educational 
Research (348). Greulich and his colleagues (312) published a manual 
describing in considerable detail a great variety of technics suitable for 
genetic studies particularly during the adolescent period. Maresh and 
Deming (343) compared the roentgenographical and the anthropometrical 
technics in the study of the growth of the long bones and concluded that the 
measurements from roentgenograms are superior to those from the corre- 
sponding anthropometric measurements. Pyle and Menino (362) compared 
the Todd and Flory bone atlases and concluded that the Todd atlas was 
superior for skeletal age assessments from birth to five years of age. (This 
was the age range covered in their study.) 

In addition to the methods and technics described in the above references 
and the various standard devices and instruments for measuring educa- 
tional growth, the Evaluation Study (364, 389) has devised and published 
tests for measuring reasoning, interpretations of data, interests, and the 
like. A recent issue of the California Journal of Secondary Education (385) 
is devoted to a symposium dealing with the evaluation of intangibles such 
as critical thinking. Janney (320) has published a technic for the measure- 
ment of social adjustment. 

Mathematical analyses of growth data have been made by Abernethy 
(272), Courtis (288), Davenport (291), Lumer (338), Scammon (369), 
and Weinbach (392). Jenss and Bayley (323) developed a growth equa- 
tion which is of considerable value in dealing with certain types of data. 
Shuttleworth (376) dealt with the inadequacies of ordinary mass statistics 
in dealing with data of the longitudinal type and he also emphasized 
the special values inherent in longitudinal data. Anderson and Cohen (276) 
published a study showing that only those cases in a longitudinal series 
which are complete should be retained for tabular and graphic presenta- 
tion and for statistical analysis if maximally consistent and meaningful 
results are sought. 
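
The practice of retaining only complete cases can be sketched as follows. This is an illustration under stated assumptions, not the procedure of Anderson and Cohen: the case labels, ages, and measurements are hypothetical, and a case is taken to be complete when a measurement exists at every scheduled age. 

```python
from typing import Dict, List


def complete_cases(records: Dict[str, Dict[int, float]],
                   schedule: List[int]) -> Dict[str, List[float]]:
    """Keep only the children measured at every scheduled age, and
    return their measurements in schedule order."""
    kept = {}
    for child, by_age in records.items():
        if all(age in by_age for age in schedule):
            kept[child] = [by_age[age] for age in schedule]
    return kept


# Hypothetical height records (cm) keyed by age in years.
records = {
    "case_01": {6: 115.0, 7: 121.0, 8: 126.5},
    "case_02": {6: 112.0, 8: 124.0},          # examination at age 7 missed
    "case_03": {6: 118.5, 7: 123.0, 8: 129.0},
}
series = complete_cases(records, [6, 7, 8])
print(sorted(series))
```

Averages computed at each age from the retained series then rest on the same children throughout, which is presumably what makes the resulting curves consistent. 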

Weinbach (392) used special growth equations in his study of devel- 
opmental data on electroencephalograms in children. Grass and Gibbs 
(311) described a technical advance in the study of brain waves which 
consists in the use of a photo-electric analyzer which gives a “frequency 
spectrum” of the electroencephalogram. 

If the genetic method is to be optimally useful, new statistical proced- 
ures may be needed. Thus, Jones (331) in the February 1939 issue of the 
Review of Educational Research pointed out that “it is conceivable that 
such a comparison, based on a parallel series of points (on actual growth 
curves) rather than on a correlation at one point in time or on the correla- 
tion of increments between two points, may reveal simultaneous variations, 
or systematic variations with constant time lags, which could not be dis- 
cerned by ordinary mass statistical methods” (331: 93). 
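
One way to look for the "systematic variations with constant time lags" that Jones mentions is to correlate two growth series after shifting one of them by a fixed number of examinations. The sketch below is only one possible reading of that suggestion; the function names and the two series are hypothetical, and the second series is constructed to repeat the first with a lag of two examinations. 

```python
import math
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Product-moment correlation of two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must be paired and have at least two points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a series with no variation has no correlation")
    return sxy / math.sqrt(sxx * syy)


def lagged_correlations(x: Sequence[float], y: Sequence[float],
                        max_lag: int) -> Dict[int, float]:
    """Correlate x at examination t with y at examination t + lag."""
    return {lag: pearson(x[:len(x) - lag], y[lag:])
            for lag in range(max_lag + 1)}


# Hypothetical increments in two traits over eight examinations.
trait_x = [1, 3, 2, 5, 4, 6, 5, 8]
trait_y = [0, 0, 1, 3, 2, 5, 4, 6]   # follows trait_x two examinations later
by_lag = lagged_correlations(trait_x, trait_y, max_lag=3)
best_lag = max(by_lag, key=by_lag.get)
print(best_lag, round(by_lag[best_lag], 3))
```

A correlation taken at a single point in time corresponds to the lag of zero only, so the lagged relation would go unnoticed. 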


Critical Evaluation of Genetic Approaches 


Although considerable work has been done with the genetic method in 
the field of child development, much remains to be done. For the most 
part, workers in the field have attacked relatively isolated problems or 
else have made use of inadequate technics, controls, or samples. Some of 
the best work has been done in collecting various norms and the resultant 
tendency to emphasize or evaluate development in terms of these norms 
is thoroughly understandable although equally mistaken. The important 
thing in the case of any particular youngster is not to know where he 
stands in terms of group norms, but how he stands with respect to his 
own potentialities and whether or not these potentialities may be altered. 
Moreover, it is impossible to overemphasize the fact that any particular set 
of measurements of a youngster represent single points which tell nothing 
of the rate or direction of the child’s growth or development. Without 
this added knowledge, comparative scores are distinctly misleading and 
predictions for the particular child are apt to be invalid. In addition to 
all this, it must be clearly borne in mind that even after all the above 
conditions have been fulfilled, the obtained evaluation of the child is 
accurate only within the particular culture pattern of the group of which 
he is a member. We really want to know what specific factors produced 
the particular outcomes, the extent to which they are operative, and how 
they may be controlled so that undesirable factors may be eliminated 
or minimized and desirable factors safeguarded and accentuated. 


Need for Advance Planning 


The fact that the best uses of the genetic method involve longitudinal 
or seriatim studies of the same children over long periods of time means 
that any errors or oversights in either planning, testing, experimentation, 
or final evaluation will be extremely wasteful and costly. Consequently, 
this method requires very careful planning, continual critical self-evalua- 
tion, and the use of the very best data-gathering devices and technics. 
Work such as that of Marshall (344) and Meredith (346) showing how 
often physical and anatomical data should be collected, and the optimal 
interval between collections of data, should be extended to other aspects 
of growth and development. Cumulative record forms of various sorts are 
being carefully prepared and utilized in connection with the genetic method 
(302). In addition it should be noted that the planned experimentation 
technics of Fisher (300) are being brought into the field of child develop- 
ment and should prove of great value. It also is important that there be 
very careful planning of the objectives and that appropriate measuring 
devices be available particularly where the research deals with the relative 
intangibles that are of such great importance. 


Conclusions 


1. The bulk of the research using the genetic method has been descrip- 
tive in character and has used the normative approach but there is an 
increasing attempt to assess the factors that cause or influence particular 
outcomes. 

2. The genetic method is being used increasingly for the developmental 
study of the relatively intangible aspects of growth. Along with this change 
of emphasis new technics and methods of research have been developed. 

3. The inadequacy of conventional mass statistical procedures for 
optimal analysis of developmental data is beginning to be realized. 

4. The need to study origin, rate, direction, and patterns of develop- 
ment before prediction and control of child development will be pos- 
sible should be increasingly recognized. 

5. The extreme complexity of comprehensive longitudinal studies, the 
inability of many groups to attempt such work, and the fact that the collec- 
tion of developmental data of the longitudinal type requires years of careful 
work, work which often cannot be duplicated, make the publication of 
raw data, so that others may utilize them and help in their interpretation, 
an extremely significant development. This publication of raw data has 
been done by both Dearborn (293) and Davenport (290) and should 
certainly be encouraged and extended. 

6. In order that the only duplication of research effort may be that which 
is carefully planned, it is necessary that research programs be made avail- 
able shortly after formulation and that there be continual publication of 
problems attacked, technics used, and raw data collected, in the form of 
progress reports. 


CHAPTER VIII 


The Interview¹ 


RUTH STRANG 


THE INTERVIEW is one of the methods used for obtaining information 
about processes, end results, or attitudes and feelings. It may be classified 
somewhere between the unstructured “projective technics” and the rigidly 
controlled standardized test. While widely used as a counseling technic, 
especially for the purpose of therapy, the interview is accepted as a re- 
search technic only with reservations. In order to understand both the 
values and limitations of the interview as a method of collecting data, 
it is necessary to review its complexity, its reliability and validity, and 
types of researches in which it is employed. 


Complexity of the Interview 


In form the interview covers a range from a casual conversation to 
a standardized interview-test such as the Binet. It may be genuinely desul- 
tory, apparently casual, or obviously premeditated (412: 3). Its content, 
even more varied than its form, may be purely factual, for example, 
census data; or highly subjective and personal, as in a psychoanalytical 
interview. The persons interviewed require an infinite variety of approaches 
in order that the desired information may be elicited. In short, “there is 
an infinite number of individual differences in the interviewee, the in- 
terviewer, and the relation between the two, as well as in the setting 
and content of the interview. Any particular interview is influenced by 
a sequence of events in the past and a foreshadowing of future plans” (416). 
Interviewing is a complex process demanding both personal qualities and 
training in the interviewer (415, 420). This uncontrolled complexity 
must be recognized in any discussion of the interview as a technic of 
educational research. 


Reliability of Interview Data 


The concept of reliability and validity will be a function of the kind 
of information sought as well as of the specific method employed. It is 
therefore more exact to speak of reliabilities rather than reliability, for 
the dependability of any instrument varies with the group, with the form 
of the instrument, and with the skill with which it is used. With such an 
unstandardized technic as the interview, the problem of ascertaining 
its reliability is exceedingly complex. There can be no single reliability 
coefficient for the interview as an instrument of research. One would 
expect interviews of the census type to have a relatively high reliability 
as contrasted with interviews of the personal reaction type. 

¹ Bibliography for this chapter begins on page 607. 

The reported reliabilities for specific kinds of interview with particular 
groups are few in number. Jenkins (407) tested the dependability of a 
series of nineteen questions by repeating the original interviews after 
forty-eight hours. On the average, 90 percent of the respondents named 
the same brand of goods on the second interview. The range for the dif- 
ferent items was from 85 to 97 percent; the average deviation was 2 
percent. In the same field of investigation, Link (408) attempted to 
ascertain how many interviews are necessary for results of a certain 
accuracy and prepared a table showing the standard deviation for 
samples of different sizes in which given percents of answers are to be 
expected. A close agreement was found between the expected variation 
for samples of five hundred. 
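
The two quantities discussed above can be computed in a few lines. The sketch below is offered with reservations: the percent of agreement follows the description of Jenkins' repeated interviews, but the formula for the standard deviation of a percent is the ordinary binomial one, and it is only an assumption that Link's table was built in this way. The brand labels and the function names are hypothetical. 

```python
import math
from typing import Sequence


def percent_agreement(first: Sequence[str], second: Sequence[str]) -> float:
    """Percent of respondents giving the same answer on both interviews."""
    if len(first) != len(second) or not first:
        raise ValueError("answers must be paired and non-empty")
    same = sum(1 for a, b in zip(first, second) if a == b)
    return 100.0 * same / len(first)


def sd_of_percent(percent: float, sample_size: int) -> float:
    """Binomial standard deviation of an observed percent (assumed formula)."""
    if not 0 <= percent <= 100 or sample_size <= 0:
        raise ValueError("percent must lie in 0..100 and sample size be positive")
    return math.sqrt(percent * (100.0 - percent) / sample_size)


# Hypothetical brands named by ten respondents on two occasions.
first = ["A", "B", "A", "C", "B", "A", "C", "B", "A", "A"]
second = ["A", "B", "A", "C", "A", "A", "C", "B", "A", "A"]
print(percent_agreement(first, second))

# Expected sampling variation of a 90 percent answer for several sample sizes.
for n in (100, 500, 1000):
    print(n, round(sd_of_percent(90, n), 2))
```

Under this formula the variation shrinks with the square root of the number of interviews, so quadrupling the sample only halves the expected deviation. 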

The reliability is far lower for less definite and objective information. 
Hollingworth (406:114-23) and other investigators have supplied evidence 
of the variability in information obtained from the same subjects or from 
two comparable groups by different interviewers. The diagnostic unreliabil- 
ity of the interview under counseling conditions must be recognized. 


Validity of Interview Data 


The problem of ascertaining the validity of information obtained by 
means of the interview is even more complex than that of reliability. 
Jenkins (407) determined the validity of the information on brands of 
goods obtained through interviews by comparing the brands reported as 
last purchased with the sales slip records of actual last purchases. The 
agreement between the results from these two methods of collecting data 
was, on the average, 78 percent, with an average deviation of 10. The 
investigators concluded that “while one may safely assume the reliability 
of last purchase questions, empirical investigation is necessary for each 
product whenever it becomes desirable to deal with validity.” 

Correlations between interviewers’ estimates of students’ ability and 
their actual marks reported by Clark (402) ranged from +.66 to +.73. 

In fifteen interviews with fifteen children in the sixth grade, 87 percent 
of 4,095 answers were identical with those given by the same children to 
corresponding questions in the social-distance questionnaire (421). Some 
evidence was obtained that these children were answering on a more 
rational basis in the interview than when filling out the questionnaire. It 
is difficult, however, to know which of these two technics is the more 
accurate. The questionnaire may be less efficacious than the interview so 
far as there is more likelihood of misinterpretation by both parties. On the 
other hand, data obtained by means of the questionnaire may be somewhat 
more objective and candid because the influence of the investigator’s own 
attitudes upon the subject is felt less and his replies may be made anony- 
mously (418). In some cases the individual’s first quick reactions on the 
questionnaire may be more valid than those which are given in an inter- 
view with more thought and analysis. Moreover, individuals vary in their 
responses, some giving more authentic replies on questionnaires, while 
others make more accurate responses in the interview situation. 

Various attempts have been made to standardize the interview. Snedden 
(414) developed an interview technic which compared favorably with 
intelligence tests as a measure of mental ability. Maizlish (409) modified 
Snedden’s disguised vocabulary test into a “Likes and Dislikes Question- 
naire.” Standardization of the interview procedure does not necessarily 
promote reliability and validity, because the highest authenticity is ob- 
tained when the approach is so skilfully varied for each individual that 
he will make his most habitual, sincere, and accurate response. The highly 
standardized interview limits the adaptability of the interviewer. Moreover, 
general impressions may be more significant than details of circumstances. 


Psychological Limitations 


Self-concern, inaccuracy of observation, memory, and judgment, as 
factors causing unreliability in the interview, have been studied in psycho- 
logical laboratories. In the opinion of Woodworth (419), self-concern is 
a most difficult psychological factor to deflect from coloring any testimony 
except that regarding matters which are extremely definite, objective, and 
impersonal. If, however, the subject’s self-concern is turned toward ascer- 
taining the facts of the case, greater accuracy may be expected. Not only 
may the person interviewed be subject to an unconscious bias of his own 
but he may also catch contagious bias from the interviewer. This danger 
was neatly demonstrated by Rice (413). The intrusion into the data of the 
interviewer's preconceived ideas seriously limits the value of the interview 
for purposes of scientific research. 

Recent experiments in the psychology laboratory of Cambridge Univer- 
sity (405) were designed to study the effect upon the interviewer’s judg- 
ment of his candidate’s character, “of a favorable or unfavorable impres- 
sion about the candidate given beforehand.” The interviewers were five 
young postgraduates, all of whom had had some experience in social work. 
The subjects were six boys. Although the conclusions from this experi- 
ment cannot be other than tentative, several important findings and 
hypotheses emerged: 

1. The bias introduced into instructions given to an interviewer did affect his 
judgment in about 40 percent of cases in which it was applied. 

2. The actual influence of the bias was not recognized by the interviewers. 

3. If the possibility of a bias is recognized, its influence may be combatted. 

4. The general impression gained of the person interviewed might result either 
in a resistance to or a reenforcement of the bias. 





Values of the Interview 


The interview is more important in the study of attitudes than in the 
study of objective facts; it is more important in the study of certain proc- 
esses than in the study of end results. Because of its value in the discovery 
of new relationships it may be appropriately employed in the early explo- 
ratory stages of researches (411:22). For example, Nestrick (410) em- 
ployed a standardized interview technic in the study of the constructional 
activities of adult males. His investigation suggested possible variations 
of the method and threw some light on problems of sampling, validity, 
and reliability. If skilfully used, the interview may also be employed to 
obtain a deeper insight into complex problems than is possible by means 
of any other procedure (404). Thus, the interview is an essential technic 
for ascertaining the “why” of relationships and the subjective factor, 
possible causes, and meanings behind objective factors. It is these meanings 
that make educational research significant and functional. 

Perhaps one reason why educational research has been so preponder- 
antly concerned with mass investigations and end results is that research 
technics for the study of individuals and processes have not been adequately 
developed. The interview is one means of studying the processes by which 
pupils arrive at certain educational outcomes. Several examples will suffice 
to illustrate this use of the interview, one in the field of arithmetic and one 
in the field of reading. By means of the interview, Burge (400) made a 
study of the mental processes by which pupils in Grades IV, V, and VI 
arrive at answers in multiplication. By this method he obtained a number 
of errors and questionable habits not self-evident from test papers. Brueck- 
ner (399) and Buswell and Lenore (401) used the interview to obtain infor- 
mation about the pupil’s uses of number, his interests in arithmetic, his 
methods of work, and his difficulties. Dewey (403:41-42) studied the 
nature of the reading process by means of the interview. He used a radio 
microphone attached to a dictaphone to obtain a permanent record of the 
interviews and later had the record transcribed by a typist. His recorded 
interviews give a more significant and intimate picture of the pupil’s com- 
prehension of a printed passage and of the stumbling blocks in the way of 
effective reading than any battery of tests now available. 


Trends in the Use of the Interview 


Eight years ago an analysis of technics used in research related to 
personnel work broadly defined showed that only 4 percent of the re- 
searches examined employed the interview as a technic of collecting data 
(417). Although a similar tabulation has not been recently made, the 
writer’s impression is that the interview has been used more frequently dur- 
ing the past ten years than in the preceding decade. 

Trends in interviewing procedure seem to lie in two diverse directions. 
In one direction is the emphasis upon standardization; in the other direc- 
tion is the tendency toward presenting to the subject as unstructured a 
situation as possible. The trend toward standardization is perhaps justified 
when the interview is used to obtain quantitative facts. For the more im- 
portant use of the interview in obtaining information about individuals’ 
attitudes, values, and morale, the interviewer’s flexibility and adaptability 
to individual differences are essential. 

Many intriguing experimental problems are involved in the technic of 
the interview, very few of which have as yet been subjected to research. 


CHAPTER IX 


Questionnaires¹ 


FRANK W. HUBBARD 


No ATTEMPT is made in the present summary to reconsider various ques- 
tionnaire studies included in the February 1934 issue of the Review of 
Educational Research. This earlier number, in dealing with the methods of 
educational research, touched briefly upon studies using the questionnaire 
and, in some instances, upon the value of the questionnaire technic. 

For full treatment of how and when to use questionnaires, attention is 
called to Koos’ small volume (428), to a Research Bulletin of the National 
Education Association (431), and to the volume on research methods by 
Good, Barr, and Scates (427). 


Complaints 


The questionnaire has the dubious honor of receiving more criticism in 
print than almost any other research technic. A superintendent of schools 
describes it as “ubiquitous, ineluctable, and a confounded nuisance” which 
takes time that belongs to the public. A writer in a lay magazine contends 
that “freedom from questions” is becoming as important as freedom of 
speech and press. A recent opinionnaire type of inquiry has been described 
by a layman as “bush league idiocy; it belongs to a selected company of 
the most preposterous documents since the invention of paper.” 

A scanning of complaints reveals these specific criticisms: (a) the num- 
ber of inquiry forms is overwhelming; (b) the time required to collect and 
to record the desired information is prohibitive; (c) the types of questions 
asked are often too personal and confidential for public tabulation; and 
(d) the information sought cannot be reliably obtained by means of the 
questionnaire technic. 

Investigations with reference to the questionnaire as an instrument of 
research correspond in some degree to the foregoing list of specific criti- 
cisms. However, the number and length of questionnaires and the personal 
nature of the data are not peculiar to the questionnaire technic. The inter- 
view technic can be equally ubiquitous, time consuming, and personal. 
Most investigations, therefore, are concerned with the fourth specific criti- 
cism, primarily questions of reliability and validity. 


Reliability of Questionnaire Data 


Several investigations have been concerned with the problem of whether 
or not the same questionnaire given the second time after an interval 
produces the same results. Cavan (424) had 123 pupils in Grade VIII 
answer a questionnaire twice with a week’s interval. Between the first and 
second trials there was agreement on 87 percent of the questions. There 
was 97 percent agreement on factual questions about self; 78 percent 
agreement on attitudes toward self. For a “neurotic inventory” a total of 
83 percent of the replies were in agreement on the two forms. There was 
no appreciable difference in the percent of agreement between boys and 
girls. When certain questions were selected (to form a scale to measure 
home background) and were assigned scores, a correlation of .81 was 
obtained between scores obtained a week apart. 

¹ Bibliography for this chapter begins on page 608. 

A similar study made by Bain (422) dealt with 61 items of factual 
family data, factual personal data, and subjective personal material. The 
information was asked of 50 college freshmen and repeated again two and 
one-half months later. Twenty-five percent of the 3,050 responses were 
different in the second set of replies. The women students showed consid- 
erably greater stability of response than the men. A second experiment 
made use of a list of 60 items with 22 men and 28 women in the second, 
third, and fourth year of college. Again nearly one-fourth of the responses 
were different in the second trial. The greatest amount of change occurred 
with the subjective personal items; the least change took place with factual 
personal data. The greater stability of the women was found to be reliable. 

Peatman and Greenspan (432) submitted 35 statements of superstitious 
beliefs and 35 statements of scientific beliefs (as camouflage material) to 
431 colored children in the fifth and sixth grades of a New York City public 
school. The questionnaire was given twice to the group with an interval of 
one month. The retest reliability of the questionnaire was .958. Ranking 
the 35 superstitious statements according to the frequency with which they 
were subscribed to by the group on each administration, the correlation 
of these ranks was .97. The authors concluded that, properly devised and 
administered, a questionnaire on superstitious beliefs was a reliable instru- 
ment for obtaining information. 
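
The rank comparison reported by Peatman and Greenspan can be sketched as follows, on the assumption that the statements were ranked by the number of children subscribing to each and that the two sets of ranks were then correlated. The counts and the function names are hypothetical; tied counts are given the average of the ranks they occupy, which is a common convention and not necessarily the one the authors used. 

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 (smallest) upward, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_correlation(first: Sequence[float], second: Sequence[float]) -> float:
    """Product-moment correlation computed on the two sets of ranks."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("counts must be paired and have at least two items")
    rx, ry = average_ranks(first), average_ranks(second)
    mean = (len(rx) + 1) / 2              # mean of ranks 1..n, ties included
    sxy = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    sxx = sum((a - mean) ** 2 for a in rx)
    syy = sum((b - mean) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("ranks with no variation have no correlation")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical numbers of children subscribing to four statements
# on the first and on the second administration.
first_counts = [40, 30, 20, 10]
second_counts = [38, 31, 12, 15]
rho = rank_correlation(first_counts, second_counts)
print(round(rho, 2))
```

A coefficient of this kind shows that the group orders the statements alike on both occasions; it does not show that individual children repeat their own answers. 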

Questionnaires pertaining to health habits and health efficiency, given 
to children seven to sixteen years of age by Seham and Schey (433), re- 
vealed reasonable reliability. Between the two questionnaires on health 
habits there was a similarity of response of 92 percent; the same corre- 
spondence was also observed between health efficiency tests. 

Lewis (429) investigated primarily the reliability of the replies of 216 
teachers to questionnaires dealing with personal data. For the city teachers, 
comparisons were made between the new questionnaire and replies to an 
earlier survey blank; the responses of county teachers were compared with 
personnel records in the county superintendent’s office. On the average, 
county teachers made 7.5 response variations; city teachers, 8.2 variations 
in response. Seventy-five percent of the teachers made from six to ten 
response errors. On the most troublesome item of the blank 96 percent of 
teachers showed variations; 16 percent showed variations on the least 
troublesome item. There was a definite tendency for respondents to seek 
to place themselves in a favorable light. Questions calling for approxima- 
tions, for example, number of hours of extra school duties, seldom corre- 
sponded on the two questionnaires. The investigation emphasized the 
necessity of clear directions in how questions should be answered. The 
investigator concluded that many responses on questionnaires calling for 
personal data are likely to be unreliable. 


Validity of Questionnaire Data 


Several studies have been made in the period under review to discover 
the conditions under which questions obtain the facts for which they are 
designed. 

Stoke and Lehman (437) reached the conclusion that the questionnaire 
is “peculiarly vulnerable when employed for collection of personal infor- 
mation or when used with subjects who see (or imagine they see) an op- 
portunity to advance their personal interests by means of the returns made 
by them.” Students in seven classes in college psychology were asked to 
report the number of times they had taken books from the reserve desk 
for assigned work. It was possible to check the replies at the library desk. 
It was found that seven in eight students overestimated the number of 
check-outs. Students of A and B scholarship exaggerated the least; C and D 
students the most. The investigators concluded that one could not rely 
upon the statements of students with regard to the amount of time given 
to study. 

Smith (435) found that, while college students varied widely on their 
ability to judge the length of a line, the average guess of the group was 
close to the truth. He also asked students and teachers to report which 
books they had read in a list containing a number of false titles. Twenty- 
six percent of the students made false statements. In another study the 
track records of 1,000 high-school boys as reported in a questionnaire were 
compared with their actual records. Validity coefficients, while higher than 
in most judgment studies, showed a strong tendency to overstatement. 
Smith concluded that questions involving judgment and personal data 
obtain responses colored by a constant error of overstatement. 
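
Smith's two findings can be illustrated with a short computation. The sketch below is not taken from his study; the guesses and the 10-inch line are hypothetical figures chosen only to show that averaging a group cancels scattered errors of judgment but leaves a constant error of overstatement untouched.

```python
from statistics import mean


def group_and_individual_error(guesses, truth):
    """Return (error of the group average, average error of individuals)."""
    group_error = abs(mean(guesses) - truth)
    individual_error = mean(abs(g - truth) for g in guesses)
    return group_error, individual_error


# Hypothetical guesses, in inches, at the length of a 10-inch line.
scattered = [7.0, 8.5, 9.0, 10.5, 11.0, 12.0, 13.0]
# The same guesses with a constant overstatement of 2 inches added to each.
overstated = [g + 2.0 for g in scattered]

for label, guesses in (("scattered", scattered), ("overstated", overstated)):
    group_error, individual_error = group_and_individual_error(guesses, 10.0)
    print(f"{label}: group average off by {group_error:.2f}, "
          f"individuals off by {individual_error:.2f} on average")
```

In the first case the group average falls much nearer the truth than the typical individual; in the second the group average carries the whole of the constant error, which is why averaging alone does not repair overstated personal data.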

Walker (439) submitted to 2,229 junior college students an inquiry 
dealing with factual personal items having to do with age, sex, year of 
high-school graduation, and father’s occupation. The replies were checked 
against valid sources. The really serious differences occurred with regard 
to school progress, for students “do not deliberately seek the stigma of 
retardation.” 

Some investigators, conscious of the possible inaccuracy of answers to 
personal questions, have suggested the anonymous reply practice. Corey 
(425) tried an attitude test dealing with the question of classroom honesty. 
Two forms consisting of fifty paired items were used. In one case the stu- 
dents signed their papers and in one case they did not. He concluded that 
“students are about as forthright in their expression when the questionnaires
are signed as when they are not signed. The concern of investigators
over the invalidating effects of a signature may have been exaggerated.” 

It is possible that validity is affected by the form in which questions are 
stated. Burtt and Gaskill (423) used six wordings of questions with 
elementary-school pupils who had viewed a motion picture film. The ques- 
tions were asked orally and the children wrote on their papers “yes,” “no,” 
or “don’t know.” More than 5,000 answers were obtained for each form 
of question. The six forms of questions were: (a) Did you see a ____?
(b) Did you see the ____? (c) Didn’t you see a ____? (d) Didn’t you
see the ____? (e) Was there a ____? (f) Wasn’t there a ____? The
results of the definite versus the indefinite article warranted no conclusions.
There was some apparent tendency for the negative form to cause greater 
suggestiveness when categorical answers were demanded. The reverse ap- 
peared to be true in comparisons between questions in the subjective form 
with the definite article. The objective form, (e) and (f) of preceding list, 
clearly showed the greatest suggestiveness and also the highest degree of 
caution. 

In spite of apparent contradictions among the foregoing studies, at least 
three conclusions pointed out by Smith (436) offer a working basis: (a) 
some respondents are more dependable than others, hence a few question- 
naires circulated among competent people should give more valid data 
than a wider distribution which includes unqualified persons; (b) where 
respondents have standards or mechanical aids the agreement on judgment 
questions is high, hence the value of defining terms or supplying definite 
criteria where judgment is involved; (c) the opinions of a group as a whole 
are more valuable than those of individuals, hence the advisability of re- 
lying upon averages and other measures of group opinion. 

Lack of evidence of reliability and validity—Davis and Barrow (426) 
reported results of a critical examination of 500 questionnaire studies 
extending over a period of thirty-eight years. Their analysis shows a gen- 
eral failure to report evidence as to the soundness of the questionnaire 
procedures followed. Of the 500 studies only 293 reported the number of 
questionnaires sent out and returned; four studies reported coefficients of 
reliability. Three hundred and eighty made no statement as to validity. 
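
The counts reported by Davis and Barrow are easier to weigh as proportions of the 500 studies. The following is a minimal sketch using only the figures quoted above; the labels and the helper name `percent` are illustrative and not part of the original report.

```python
TOTAL_STUDIES = 500

# Counts as quoted from Davis and Barrow (426).
counts = {
    "reported number sent out and returned": 293,
    "reported a coefficient of reliability": 4,
    "made no statement as to validity": 380,
}


def percent(count, total=TOTAL_STUDIES):
    """Express a count as a percent of the studies examined."""
    return 100.0 * count / total


for label, count in counts.items():
    print(f"{label}: {count} of {TOTAL_STUDIES} ({percent(count):.1f}%)")
```

On these figures fewer than three studies in five reported their returns, fewer than one in a hundred reported reliability, and about three in four were silent on validity.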


Increasing the Return 


Close to reliability and validity in importance is the factor of repre- 
sentativeness of sampling. Obviously, a high proportion of return is hoped 
for by every investigator because of the probably greater representative- 
ness of the responses. Various types of appeals or pressures have been 
used. A monetary stimulus was tried by Shuttleworth (434) in a study of 
adult attitudes toward the financial support of health work in New York 
State. In one area each inquiry blank was accompanied by a personal 
letter and 25 cents; in another area only a personal letter and a stamped
envelope were used. The coin area returned 52 percent of the blanks, while 
only 19 percent of the forms were returned in the noncoin area. Taking 
the cost into consideration, there was some doubt that the procedure of 
paying respondents could be recommended as the usual practice. 
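
The doubt about cost can be made concrete by dividing the outlay per blank sent by the proportion returned. The sketch below uses the 25-cent coin and the 52 and 19 percent returns quoted above; the mailing cost of 5 cents per blank is a hypothetical figure, since the source does not state postage or printing costs, and the function name is illustrative only.

```python
def cost_per_return(cost_per_blank_sent, return_rate):
    """Outlay per blank actually returned, given outlay per blank sent."""
    if not 0.0 < return_rate <= 1.0:
        raise ValueError("return_rate must be greater than 0 and at most 1")
    return cost_per_blank_sent / return_rate


COIN = 0.25          # dollars enclosed with each blank (from the study)
COIN_RATE = 0.52     # proportion returned in the coin area (from the study)
NONCOIN_RATE = 0.19  # proportion returned in the noncoin area (from the study)
MAILING = 0.05       # hypothetical cost of letter, envelope, and postage

coin_total = cost_per_return(COIN + MAILING, COIN_RATE)
noncoin_total = cost_per_return(MAILING, NONCOIN_RATE)

print(f"coin area:    ${coin_total:.2f} per returned blank")
print(f"noncoin area: ${noncoin_total:.2f} per returned blank")
```

Under the assumed mailing cost, each returned blank in the coin area costs roughly twice as much as one in the noncoin area, although the coin area yields nearly three times the return. Whether the gain in representativeness is worth the added outlay remains a matter of judgment for the investigator.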

Toops (438) examined 135 questionnaire studies seeking to isolate 
certain factors useful in predicting percent of returns to questionnaire 
studies. His listing of elements for obtaining a high percent of returns can 
be briefly stated as follows: 


(1) Select for study a topic in which the recipients of the questionnaire 
are vitally interested in knowing the answer. 

(2) Send the blank to those persons who because of personal friendship 
or knowledge of your professional repute will feel a personal obligation 
to answer. Promise to provide a copy of the results. 

(3) Employ a vigorous follow-up technic designed to touch motives as 
viewed from the angles of recipients. 

(4) Use best possible technic in writing questions. 

(5) Circulate questionnaire in those parts of the country where replying 
approaches a fixed habit. 

(6) Don’t tax the interest and effort of a recipient, but make it easy 
for him to reply. 

(7) Use objective, unequivocal, but “sensible” questions. Be chary of 
essay answers. 

(8) Employ advisedly such incidental pressures as “moral obligation 
to reply.” 

(9) Send questionnaires early in the school year before the pressure 
of duties decreases the chances of attention. 

(10) Don’t worry about the length of the blank if the rules are followed; 
but undue length may be a symptom of slovenly technic. 


Trends and Innovations 


One of the most interesting practices that has appeared in recent years 
has been the use of pictures and sketches in questionnaire blanks. Perhaps 
the widest utilization of this type of blank has been in the field of discov- 
ering the consumer opinion. General Motors Corporation has used the 
procedure for a number of years in ascertaining what potential purchasers 
wish to find in automobiles. 

A most elaborate pictorial questionnaire was circulated by the University 
of Minnesota in discovering what alumni were doing and what their expe- 
riences have been (430). The blank included 45 printed pages of questions, 
30 half-tone pictures, and 4 pages of line sketches. The four parts of the 
blank were: (a) earning a living; (b) home and family; (c) socio-civic 
affairs; and (d) personal life. Very little space was provided for written 
comments or items. Respondents were called upon to rate items, check 
best descriptive statements, and to follow other types of replying found in
so-called new-type examinations. 

In addition to the appearance of illustrated questionnaires, there has 
been an increased tendency to use the questionnaire for a wide variety 
of purposes. Early practice used it primarily to collect factual data which 
could be treated statistically. Gradually opinion or judgment questions 
were introduced and many investigators compiled the replies in statistical 
form. As indicated in studies reviewed in preceding paragraphs, the judg- 
ment reply, and the essay reply particularly, have been viewed with some 
skepticism. However, the problem has shifted due to the changing use of 
opinion-type blanks. Recently questionnaires have been used to arouse the 
interest of principals in supervisory problems and the enthusiasm of high- 
school students in school management. In these instances the primary inter- 
est of the investigator has been to stimulate discussion—not to obtain 
statistical measures of conditions. Under such circumstances reliability, 
representativeness, and validity appear, at least to the investigators, as 
relatively unimportant. 

A third innovation has been the increased use of “sounding out” ques- 
tionnaires. Usually this means a brief inquiry form, possibly a postcard, 
designed to ask the recipients to participate in a more detailed investiga- 
tion, or to discover conditions meriting more careful study. This procedure 
has been used by the Research Division of the National Education Association
in several studies. Usually this procedure results in case studies
rather than comprehensive surveys. The plan has the advantage of excusing
recipients from attempting to fill out blanks in which they have no interest 
or upon which they have no information to supply. The possible disad- 
vantage is that these selected reports will be assumed to be representative 
of general conditions. 








CHAPTER X 


School and Community Surveys¹


JESSE B. SEARS 


TO VIEW A SCHOOL OR COMMUNITY SURVEY as a distinctive method of
research may be questioned, and the use of the term here may be contrary 
to the notions of those who see in the survey not one but many research 
methods. This brief section is not expected to close discussion of this 
question. It will, however, attempt to review the research activities related 
to surveys, accepting the classification on the ground that there is some- 
thing distinctive about the survey as a research method—the distinctive 
thing being that from the standpoint of its procedures a survey is some- 
what more than the sum of its parts. 

While a survey includes many separate studies, each of which has a 
special purpose, special data, special technics, and special procedures, 
there is also a dominating over-all purpose, and it is with reference to this 
that choice of data and form of treatment are determined. As a single 
musical note is one thing when sounded alone and a different thing as part 
of a symphony, so it is with studies of the curriculum of a school system, 
of the trend in costs, of the social composition of its staff, or of the relative 
efficiency of its pupils in spelling—when treated separately, and again in 
a survey. Alone and separately these are simple studies; properly com- 
bined in a survey, they contribute to the over-all purpose of the under- 
taking and concern themselves less with isolated findings. The point 
of emphasis in this review of research is not upon the survey movements— 
their purposes, development, spread, and outcome, as such—but upon the 
research methods they use. 


Social Surveys 


A social survey stands to the field of social problems—poverty, slums, 
crime, exploitation of workers, race difficulties—as the school survey stands 
to educational problems. The reason for including the literature on the 
social survey here lies in the similarity and overlapping of the social and 
the school survey movements. Every comprehensive community survey 
collects social data and uses methods useful in an educational survey. 
Since school programs rest upon social as well as psychological founda- 
tions, school surveyors have direct use for many of the technics and data 
of the social survey. 

History—The social survey movement reaches far back in history,
though the current movement in this country antedates school surveys by
only a decade or two, even if we regard the work of such men as Jacob
Riis, Theodore Roosevelt’s commission, and Lincoln Steffens as having
initiated the movement. The social survey commonly regarded as the first
in this country was that of Pittsburgh by Kellogg (459) in 1909, while
the first school survey came in 1910. From the point of view of research
method, however, the earlier scientific studies of society by LePlay (460)
in 1855 and by Booth (441) in 1892-97 cannot be ignored. Even now the
purposes, methods, and technics of these men are in use in community
surveys.

1 Bibliography for this chapter begins on page 609.

In the field of social surveys there is already an extensive literature on 
methodology itself, to say nothing of the many works that have dealt with 
the question of a science of sociology and with the surveys themselves. 
The community survey is one of the instruments of the sociologist, as it is 
the chief instrument of the scientific social worker—whose aim is social 
reform rather than that of developing a theory of society. 

For the general orientation of one who wishes to understand this move- 
ment, mention should be made of the works of LePlay and Booth, above 
noted; of that by Howard (455) in 1779, to mention a very early study; 
of those by the Residents of Hull House (456) in 1895; by Riis (470) in 
1890; and Lincoln Steffens (476) in 1904. In such investigations and in 
many other studies of those years will be found the foundations of present 
methods of community survey work. In many of these surveys where the 
aim was social reform rather than scientific study, one needs to look 
behind the interpretation for the method. In all these studies, masses of 
facts were assembled and analyzed. Steffens worked by the methods of a 
reporter and a detective. He studied cases and traced connections. Riis 
described and painted a picture of the life of poverty. Booth and LePlay 
used statistical methods. They gathered and counted facts and added them 
up. Incident to their work they formulated extensive schedules and ques- 
tionnaires as research technics. 


Recent Methods of Social Surveys 


By the 1920’s the movement was well established and the question of 
methodology began to receive more attention as such. Due probably to 
the social reform viewpoint, the earlier idea of studying an entire com- 
munity in all its aspects gave place to studies of special features. Some- 
times one or, as in the case of Harrison’s Springfield survey (453) in 1920, 
as many as nine separate aspects of the community were studied. A note- 
worthy feature of methodology of such surveys is found in the use made 
of local talent. Here again the reform idea dominated. Not mere descrip- 
tion or scientific generalizations, but social improvement, was the goal and 
for this local interest an organization had to be built up; hence, the idea 
of cooperation in the studies. As early as 1920 Chapin (447) published a 
volume dealing with methods for examining documentary sources, how to 
collect data, methods of sampling, types of schedules, and methods for in- 
terviews. In 1928 Petit (468) demonstrated a method for handling case 
studies designed to assist in building up social welfare agencies, and
Palmer (467) published an effective explanation of methods then in use, 
affording a helpful analysis of many separate technics. 

In 1929 Odum and Jocher (465) wrote their Introduction to Social
Research; Lundberg (461) published his study of methods for gathering 
data; and the Lynds (462) brought out their first study of Middletown, to 
be followed eight years later by a second check-up study (463). These 
Middletown studies are demonstrations of research clearly conceived and 
well carried out. If they do not open up entirely new methods in detail, 
they do reveal illuminating refinements and extensions of many methods 
and especially contribute to the concept of a unified scheme of investigating 
a total community situation. 

As a sample of more recent works, note should be taken of Fry’s book 
(452) dealing with technics of social investigation, of selected articles in 
the Encyclopaedia of Social Sciences—especially that by Carpenter (442);
of the recent series of monographs published by the Social Science Research
Council dealing with the social aspects of the depression (474);
and of Young’s 1939 treatise (477) on the research methods used in social 
surveys. 

A widely varied but extremely stimulating display of research methods 
is to be found in the reports of the many state planning commissions. 
Bibliographical work in this field is well done. From such works as those 
of Fry (452), Lundberg (461), Odum and Jocher (465), Young (477), 
and Eaton and Harrison (449)—which listed some 2,700 social surveys— 
the literature is readily available. 


School Survey Trends 


During recent years there has been an increasing social emphasis in 
education. Curriculum changes, guidance and teaching technics, library 
development, playground supervision, and even the physical facilities of 
a school system—all reflect this emphasis. There is a movement away from 
emphasis upon books and upon subjectmatter as such; away from build- 
ings, playgrounds, and apparatus as having laws unto themselves; and 
away from pupil management and school administration of the mechani- 
cal and strictly authoritarian type. The school has become more of a social 
enterprise. These new educational purposes are reflected in new survey 
objectives and new survey methods. 

Analysis and classification of surveys—Caswell (444, 445, 446) exam- 
ined a “large number” of reports and set up classifications of surveys as 
to scope, agency in charge, purposes, and technics used. Eells (450), 
Sears (472), and Ozanne (466) offer other suggestions. The assumption 
of these investigators appears to be that the survey is a research method. 
For higher education the studies of survey reports by Eells (450) and 
Heston (454) are by far the most thorough and detailed. Heston made a 
topical analysis of what surveys had studied and of the recommendations
of the surveys. Eells studied 230 printed, 70 mimeographed, and some 300 
manuscript survey reports. He classified the reports as to technics used, 
as to general viewpoint applied, and as to the form of report. It is clear 
enough from Eells’ analysis that the nature of the institution to be sur- 
veyed has dictated the research approach, the specific technics to be used, 
and the special combinations of these. 

Bibliographies—The bibliographical work in the field is of special value
to one desiring to see the school survey in its broader aspects, as an
instrument of inquiry aimed at educational betterment. Following
their earlier work Smith and O’Dell (473) brought their bibliography of 
surveys and of writings on the survey down to December 1937. Eells (450)
provided an equally exhaustive bibliography for the field of higher education
to 1936.


History and Development of Surveys 


A review of the history of the survey movement is given in three more 
recent reports by Caswell (443), Sears (472), and Judd (457), showing 
something of the origin and development of the movement, and the nature 
of the research concepts that dominated the actual work done. Lack of 
sound research methods available for this kind of work was early brought 
to light and a strong stimulus was given to research devoted to the development
of suitable technics. The historical and comparative methods were
applied where possible; the then new technic for revealing retardation was 
perfected and used; the test movement was brought to bear and given a 
healthy stimulus; and the interview method was used. Very early also the 
general purpose of the survey was debated, some hesitating to go beyond 
the evaluation purpose. In time the more constructive aim dominated and 
gradually the idea of a survey as a unified study of all parts of a school 
system, by many separate methods and separate investigations in combi- 
nation, with a view to improving school practice, brought the survey to 
its present position. 

In the field of higher education the history of surveys has been devel- 
oped by Heston (454) and Eells (450). Combined, these studies offer an 
exhaustive review not only of the movement in its general aspects but also 
of the problems it attacked and the methods applied. 

In this general connection, also, a recent volume of the report of the 
Advisory Committee on Education, prepared by Judd (458), presented a 
clear picture of the survey as a feature of the research work done by the 
United States Office of Education. How this research function has been 
developed from the start in 1867 as one of two major purposes of this 
Office was traced, and the origin and development of its survey work were 
made clear. The actual developments within the Office of Education from 
local to state and nationwide surveys offer an excellent illustration of the
broader social view of the survey as a research instrument. 


Methods and Technics Used in Surveys 


If one viewed the many individual studies made in any given survey 
he would not find many entirely new research procedures. It is in the 
combining of these studies that one finds something different in each 
survey. To answer any of the questions raised by a survey may call for 
one or several researches. The researches individually may involve nothing 
of importance by way of new method, though to have raised the questions 
and planned the approach may have required profound knowledge and 
insight and analytical ability. Volumes IV and X of Reeves’ University of 
Chicago survey report (469) offered illustrations of this point. This is to 
say that survey research is less difficult and less unique in its separate 
technics than it is in its earlier stages where discovery and definition of 
problems and an appropriate combination of methods of attack are 
determined. 

Another recent study in the field of higher education stands out both 
for its originality of approach and its detailed technics of study. The North 
Central Association Committee on Revision of Standards (464) was concerned
with evaluation only—which is but one phase of a complete survey.
In seven volumes the report offered a complete explanation of the problems 
it raised in fifty-seven institutions of higher learning, and gave a full 
description of the procedure followed and of how the results were used 
in establishing norms. More use was made of subjective judgment than 
has been common, but the treatment of such data was in many cases new 
and unique, as well as convincing to a scientific worker. A brief analysis 
of this report was presented by Sears (471). It is not too much to say 
that this study reflected a new viewpoint in evaluation, and in reality a 
presentday theory of college education applied in accreditation. This latter 
point was especially revealed by the sections dealing with the care, direc- 
tion, and encouragement of students. The study produced, also, a compli- 
cated but effective instrument of evaluation to be applied to an entire 
institution. The elements chosen as the indications of excellence were con- 
vincing and the statistical treatment of these in numerous cases was un- 
usual. The final tool devised is not new as a technic; it is a score-card 
type of instrument. Yet, it is quite new as to the elements of which it 
is made. 

In the field of building evaluation higher education surveys may expect 
to profit from the new standards devised for use with their score-card by 
Evenden, Strayer, and Engelhardt (451). These standards are not new in 
the method of their development but are new in the nature of the functions 
used as the basis against which needs for the various elements were judged. 

In the field of city school surveys, Caswell (445) analyzed the technics 
used in nine survey reports to determine their treatment of secondary
education. He found 34 separate problems had been studied and then for 
each problem he showed what and how data were assembled for its solu- 
tion. In a later study (446) Caswell noted seven distinguishable methods 
by which data were collected, as follows: analysis, score-card and rating 
scale, standard tests, case study, experiment, interview or questionnaire, 
and observation. He found thirteen technics for evaluating data or sep- 
arate features of the school system, as: five uses of comparative procedure 
(comparison with other units within the system, with comparable outside 
systems, with neighboring cities, with average practice, and with outstanding
practice); equated groups; application of standards; test standards;
score-card or rating scale; measuring against research results;
judgment of survey staff; expert opinion; and check against trends. Taken 
separately, the data used, the technics applied, and the procedures followed 
offer nothing striking as features of research methods. Yet, any but a 
veteran surveyor would find many new research ideas in the kinds of 
problems studied and in the data and approaches used in numerous cases. 

In the field of physical education Davis (448) studied 207 surveys of 
all kinds, from which he developed a checklist of methods and technics for 
use in this field. He then applied this list to 117 city school survey reports, 
checked it further against other research literature, and developed a rating 
scale with sets of forms and procedures to guide its use in a survey. 

A number of other lesser studies of survey methods could be cited, such 
as Blauch’s study (440), “Curriculum Surveys in Higher Education.” 
From what has been presented, however, it is clear that when one refers 
to the studies of separate problems he finds little that is new in research 
method. The new element is not the interview, question schedule, test, or 
score-card, but the way in which such technics are used in combination that 
reveals new trends in survey research. One cannot fail to note the differ- 
ence between survey treatments of curriculum, supervision, library, social 
program, and personnel problems in recent surveys (Stockton, California, 
and St. Louis, Missouri, for example), as compared with treatments of a 
decade or more earlier. In the earlier treatments one finds more bare 
facts, more counting and adding up, more emphasis on subjectmatter and 
books, more use of standards; in the later treatments he finds more anal- 
ysis, more concern with the child and with his personal development. The 
fitness of curriculum, guidance, and administration of buildings is now 
judged more by their contribution to the attainment of personal and social 
aims and less by their conformance to arbitrary standards. A forward 
look seems to indicate that it is toward clear thinking and careful use of 
facts rather than something distinctive in research technic that we are to 
look for development in survey research in the next few years. 





CHAPTER XI 


Testing: Intelligence, Aptitude, Personality,
and Achievement¹


G. M. RUCH and P. T. ORATA 


INSPECTION OF SEVERAL HUNDRED TITLES in the field of testing, chiefly
between January 1938 and July 1939, indicated that intelligence testing 
continued to hold first place in number of articles published within the 
scope of this summary, despite enormous gains in activity in the measure- 
ment of aptitude and personality. The bibliography of this chapter, there- 
fore, represents a high degree of both quantitative and qualitative selection 
in order to place emphasis on new developments in measurement. The 
omission of much otherwise significant material rests solely on the basis 
that no outstanding uniqueness of method is involved. 


Intelligence 


The summaries to 1938 in the Review of Educational Research for June 
1938 by P. Cattell (501) and Keys (561) comprising, respectively, 68 and 
187 titles, need comparatively little supplementation so far as methodol- 
ogy and results are concerned. Developments since that date center chiefly 
in three areas: (a) further discussion of the 1937 L-M Scales of the 
Stanford-Binet, (b) renewed controversy over the constancy of the IQ 
arising chiefly from recent work at the State University of Iowa, and (c)
the potentialities of factor analysis in psychological and educational mea- 
surement. 


The New L-M Stanford-Binet Scales 


The two books by Terman and Merrill (619, 620) presenting the L and 
M scales were the signal for renewed consideration of the Binet method, 
both for school use and for clinical purposes. Kent (559) presented 
suggestions for another revision, arguing that the age-scale method is 
wasteful and not well adapted to clinical practice, particularly when there 
is “nondiscriminative material at the upper and lower ends of the sub- 
ject’s natural range.” R. B. Cattell (502) criticized the intuitive method 
in test construction in applied psychology, particularly the Binet tests. 
Vernon’s reply (630) to Cattell and other critics of the Binet discussed the 
merits and limitations of both psychometric and clinical approaches, 
holding that the Binet embodies both methods. Both Cattell and Vernon 
listed numerous references. Bernreuter and Carr (487) and Merrill (572) 
discussed the significance of the IQ’s yielded by the new scales; the latter 
pointed out that the addition of new tests at both the lowest and highest
levels of the scale results in a different interpretation of the quotients on
old and new scales. For example, the lowest 2 percent of the 1916 sample
of 905 cases tested 73 or below, but the lowest 2 percent of the 2,904 cases
used in the new scale tested 70 or below. A new table for the interpretation
of L-M quotients is presented, with the caution that classification as
defective is not possible by tests alone. Burt (496) compared the 1937 Binet
with the English version of the 1916 scale. He found the new scale more
effective in diagnosing the dull and defective, more reliable for showing
the relative influences of a general versus other factors, but that the
order of difficulty of subtests did not agree with that for English children.

1 Bibliography for this chapter begins on page 610.


The Constancy of the Intelligence Quotient 


The nature-nurture controversy has received new attention as the result 
of a series of studies from the Iowa Child Welfare Research Station. 
Skeels (604), basing his conclusions on children tested 12 to 60 months 
after being placed in foster homes, reported in 1936 that “the mean level 
of intelligence of these children is higher than would be expected . . . 
from the educational, socio-economic, and occupational level represented 
by their true parents.” He found zero correlation between true mother’s 
IQ and child’s IQ. Later studies by Skeels and others (602, 603, 605)
reached the same conclusion. Wellman followed her 1932 report (639) 
of a steady rise in IQ year by year for children attending the Iowa Pre- 
school Laboratories with four additional papers (634, 636, 637, 638) 
advancing similar conclusions. She said: “The extent of upward change 
that may take place is truly remarkable. We have examples of children 
entering preschool with average intelligence who, after especially favorable 
circumstances, have later tested at the ‘genius’ levels” (634). Stoddard 
(610, 611, 612) concluded that intelligence level is not fixed, but that 
a richer environment stimulates genuine mental growth. Skodak (606) and 
Crissey (509) also suggested that intelligence is much more responsive to 
environment than has previously been supposed. 

This group of Iowa studies has been vigorously challenged by Simpson 
(600, 601) who claimed that the significance of these studies of the 
“wandering IQ” is completely obscured by ambiguities and inconsistencies 
in tabular data, failure to report individual scores year by year, and 
failure to allow for selective factors in school-leavers. He argued that the 
rises in IQ mean nothing more than “a survival of the fittest.” Wellman 
(636) replied to Simpson’s interpretations, and in another paper (635) 
she showed a similar, but less marked, increase in Merrill-Palmer IQ’s 
under repeated testing. 


Factor Analysis in Psychological Measurement 


Cureton and Dunlap (511) summarized the work on the factor theory 
up to 1938 in the Review for June 1938, and Chapter XIII of the current 
number, by Holzinger and Harman, brings the literature up to date. The 
publication of Thurstone’s Primary Mental Abilities (622) provided the 
educator with a more concrete picture of the types of test materials that 
mathematical analysis suggests as useful in the measurement of primary 
mental traits. It is sufficient at this point to call attention to the fact that 
the methods of factor analysis and intuitive analysis of psychological abili- 
ties present fundamental differences. The time for the production of mental 
tests based upon factor analysis is at hand; just what similar analysis of 
educational abilities will yield by way of new achievement tests and cur- 
riculum reorganizations is at present challenging but purely speculative. 
Alexander (478) and Feder (527) applied factor analysis methods to 
educational tests and Guilford (538) to the production of four new forms 
of Army Alpha. 


Books and Bibliographies 


Outstanding aids to the test worker are: Buros’ 1939 Mental 
Measurements Yearbook (494) and Hildreth’s revised Bibliography of Mental Tests 
and Rating Scales (551). The former provides (usually) two or more 
critical and independent reviews of tests published since the appearance 
of Buros’ two earlier monographs. An innovation is the inclusion of re- 
views of books in the field of statistics and measurement. Educational, 
aptitude, and personality tests are also considered. Hildreth’s bibliography 
is an extension and revision of her 1933 volume, with 4,279 titles classified 
under 17 headings. These two books provide virtually complete coverage 
of the field, and together constitute a working reference library of test 
materials. Revisions appeared of Freeman’s Mental Tests (529) and 
Inglis’s tables (554) of IQ values, which now provide the extensions 
demanded by the new Stanford-Binet scales. 


Miscellaneous: New Tests, Revisions, Reliability, and Validity 


Kuhlmann (564) presented a new intelligence scale resulting from his 
work with the Binet and Kuhlmann-Anderson tests. Norms were based 
upon 3,000 cases of ages from three months to adult. In addition to mental 
ages, scoring methods provided, above age nine, speed and accuracy 
ratings. The IQ was replaced by the PA (percent of average), as being 
less variable. Recent revisions are: Otis Quick Scoring (582); Detroit 
First-Grade Intelligence Test (523); and the Michigan Non-Verbal Series 
(536). Kerr (560) in England and Miller (573) in America continued 
to study the value of children’s drawings for the measurement of intelli- 
gence. Higginson (550) published an objective test of imagination and 
Carl (499) devised a test for older children and adults in which a hole 
is filled with blocks of geometric forms; a reliability of .88 based on 
1,508 adults and a correlation of about .77 with the Stanford-Binet were 
reported for the last mentioned test. 


December 1939 TESTING 





Strang (613) and Brill (492) discussed the validity of the Porteus 
maze test. Williams and Lines (641) evaluated the Ferguson form boards 
and derived new norms. Evaluations of other tests were reported as fol- 
lows: Metropolitan Reading Readiness Test and Pintner-Cunningham 
Primary Mental Test, Grant (535); Kuhlmann-Binet, Arthur (481); Goode- 
nough drawing test, McCarthy (569); CAVD tests, Pintner and Stanton 
(585); and the Spearman Visual Perception Test, Arsenian (480). Peatman 
(584) and Jackson (555) discussed the reliability and meaning of test 
scores. 

Blatz and others (491) presented the growth of the Dionne quintuplets 
in a series of five monographs. Outstanding conclusions are: The girls 
show general retardation, especially in language; there are marked and 
more or less stable personality differences that appear to be environmental 
in character; and the quintuplets are more retarded in speech than a 
control twin group. 

Watson (632) reviewed the intelligence movement, concluding that 
there is a great need for more studies of mental development. MacMurray 
(570) compared gifted and dull-normal children by the Pintner-Paterson 
and Binet scales. Wilson and Fleming (642) studied the intercorrelation 
of abilities in the first grade. Vernon (629) suggested that sophistication 
or test-wiseness may be an important element in test scores. 


Aptitudes 


Books and Reviews 


O’Rourke (581) examined more than 500 studies in aptitude measure- 
ment and reviewed 130 vocational aptitude tests in the Review for June 
1938. Outstanding books have appeared by Bingham (489) and Paterson, 
Schneidler, and Williamson (583). The former discussed the nature of 
aptitudes and the theory of their measurement. The latter is to be regarded 
chiefly as a handbook of directions, norms, and data on the validity and 
reliability of available aptitude tests, particularly those devised at 
Minnesota. 


Aptitude for High School and College 


Darley (512), Dickter (515), Langlie (565), and Selover and Porter 
(599) studied the use of psychological tests in predicting college success. 
Darley suggested that ability, attitudes, and college adjustment are prob- 
ably unrelated. Dickter found that the mathematical parts of the CEEB 
examination predicted success in mathematics, but the verbal elements were 
of little value. Seagoe compared the predictive values of certain achieve- 
ment tests with those of special aptitudes in algebra (596) and foreign 
languages (597). The New Stanford Achievement Test (arithmetic and 
reading) provided fully as good a basis for prediction as did the intelli- 
gence tests or special measures of aptitude in these subjects. 


Aptitudes for the Professions and for Selling 


Dwyer (521) gave the Strong interest tests to 418 entrants to medical 
school. Four factors (physicist, journalist, minister, and life insurance 
salesman) proved of most predictive value. Harris (543) administered 
five mechanical aptitude tests to 68 dental freshmen; these, combined with 
intelligence scores, gave a multiple correlation of .67 with dental school 
work. Stump (616), Stuit (615), and Sandiford and others (594) at- 
tempted to find predictive measures of teaching success. Aptitude tests, 
scholarship ratings, and success in practice teaching all proved to have 
low predictive value. 

Lawe and Raphael (566), Dodge (516), and Bills (488) studied the 
values of tests for selecting salesmen. Lawe and Raphael reported satisfactory 
results from tests employed at Harrods, Ltd. in London, and suggested 
the existence of upper and lower critical scores. Dodge listed nine items 
of the Bernreuter Personality Inventory that differentiate good and poor 
salesmen. Bills found that the life insurance selling and real-estate selling 
factors of the Strong interest test are related to success in selling insurance. 
Candee and Blum (497) devised a new scoring system for the Minnesota 
Clerical Test. 


Mechanical Aptitudes 


Drake (517, 518) and Drake and Oleen (519) evaluated various tests 
for selecting industrial employees and studied the psychological factors 
necessary to success on the job. Outstanding findings were: a new pin 
board with a reliability of .92 and a correlation of .59 with foremen’s 
ratings; a new hand-foot coordination test; and 30 percent savings in 
operation through dual, or two-hand, operation. O’Connor (579, 580) 
presented further analysis of the Black Cube, Work Sample 167. Candee 
and Blum (498) gave the O’Connor finger dexterity and tweezer dexterity 
tests to mediocre and superior workers in a watch factory; the latter 
proved valid but the former was not except at a lower critical level. Age 
and experience did not affect the finger dexterity test. Wells (640) pub- 
lished his fourth paper on four O’Connor tests. Hearnshaw (548) de- 
scribed selection tests for inspectors in a paper mill. Burr and Metcalfe 
(495) revised the norms on the I. E. R. Assembly Test. Babcock and Emer- 
son (482) analyzed the MacQuarrie mechanical ability test, reporting a 
correlation of .62 with the Binet vocabulary test—a correlation which 
increased with age as did the intercorrelations of the MacQuarrie subtests. 


Personality, Interests, and Attitudes 


Books and Reviews 


Watson’s summary (631) in the June 1938 Review of 329 titles pub- 
lished between January 1935 and December 1937 suggested the enormous 
activity in the measurement of personality and character. Traxler (623) 
examined critically the leading tests in the field, listing 183 titles in his 
bibliography. Important books or monographs appeared by Thorpe (621), 
Murray (575), Garrett (533), Hartshorne and others (544), and Spencer 
(609). Thorpe’s volume gave a comprehensive survey of the literature 
from all fields of psychology. Murray presented a detailed study of fifty 
men of college age over a two and one-half year period. Garrett used 
Thurstone’s centroid method of factor analysis as a first step in defining 
and measuring personality traits. Hartshorne and others studied the in- 
tellectual, social, moral, and physical growth of 1,200 boys. Spencer’s 
volume presented the personality conflicts of high-school students as re- 
vealed by a paper-and-pencil questionnaire. 


Experimental Studies of Personality Tests 


The Bernreuter Personality Inventory continued to receive critical at- 
tention. Farnsworth (526) retested 319 college students with the Bern- 
reuter inventory at intervals of one, two, and three years. Responses 
proved relatively stable with time, as were intercorrelations. Jarvie and 
Johns (556) concluded that the Bernreuter inventory offers little aid in 
educational counseling. Nemzek (578) found the inventory of little value 
in predicting academic success as measured by teachers’ marks. Hayes 
(547) decided that college women with several older siblings tended to 
be more neurotic and less self-sufficient and dominant, a finding previously 
reported by Stagner and Katsoff. Bennett (486) further simplified the 
Flanagan method of scoring the Bernreuter inventory. 

Research on the Rorschach Ink Blot Test took mainly the direction of 
standardization of this clinical method. Troup (624) applied this test to 
twenty pairs of identical twins; no marked resemblances in temperament 
were found. Hertz (549) and Suares (617) attempted to objectify scoring 
and provide further norms. Fosberg (528) reported that the reactions 
were stable under retesting, even with changed directions. Rorschach data 
from many investigators were summarized by Davidson and Klopfer (513). 


Interests and Attitudes 


Strong (614) published a new edition of his Vocational Interest Blank 
for Men. Kopas (562) developed a “point-tally” method for scoring the 
Strong blank. Using the Strong scores as a criterion, Estes and Horn (525) 
constructed two scales that would differentiate between interests in me- 
chanical and in electrical engineering. Carter and Jones (500) found the 
Strong scores to be closely related to high-school students’ vocational 
choices. Garrison (534) and Cleeton (505) developed new interest in- 
ventories. Davies (514) cautioned test workers against giving interests an 
“all-determinative” role in vocational choices. 

In an extensive study of 3,758 students in four state universities and 
fourteen church colleges, Nelson (577) gave the Lentz C-R Opinionaire to 
determine the prevalence of radicalism. The mean scores tended toward 
conservatism; few radicals were found; seniors were less conservative than 
freshmen; women were more conservative than men; and small differences 
were found from school to school. Corey and Beery (507) concluded that 
liking for school subjects is closely related to liking for the instructor. 


Ratings 


Few, if any, unique contributions in the use of ratings came to the 
attention of the reviewers within the period covered here. One extensive 
study was that of Eells (522) who studied the best liked and least liked 
aspects of 200 secondary schools, securing 24,000 returns. Scales were 
formulated that grouped these aspects under such headings as school staff, 
curriculum, pupil activity program, and guidance. (For other studies on 
ratings, see references 409, 493, 504, 574.) 


Miscellaneous 


Bell (485) published an adult form of his adjustment inventory that in- 
cludes: (a) home adjustment, (b) health adjustment, (c) social adjust- 
ment, (d) emotional adjustment, and (e) occupational adjustment. Experi- 
mental studies have been made of the following personality measures: 
Willoughby (524), Baxter (484), Stanford M-F test (592), 
Woodworth-Cady and Baker “Telling What I Do” test (643), and the 
Loofbourow-Keys Personal Index (591). 


Achievement 


Books, Reviews, and Monographs 


In the Review for December 1938, which summarized more than four 
hundred studies of educational tests and their uses, Scates (595) empha- 
sized the changing conception of measurement, shown particularly in the 
work of the Eight Year Evaluation Study. 

McCall’s Measurement (568), a revision of his early text, was con- 
spicuous for its shift toward the aims of the progressive education move- 
ment. Smith (607) devoted 182 pages to a critical examination of con- 
cepts of testing, a volume that has already proved very stimulating to 
those interested in the fundamentals of measurement. South (608) com- 
piled a glossary of terms used in measurement and guidance. Ruch and 
Segel (593) prepared a handbook for counselors, on the use of the indi- 
vidual inventory in guidance. Segel (598) also summarized the cumulative 
record systems of 177 school systems. 


Test Technics 


Omitting for the present the Evaluation Study, several papers on test 
technics should be mentioned. May (571) wrote a very penetrating dis- 
cussion of the logic of measurement. Kuder and Richardson (563) and 
Remmers and Whisler (590) considered critically the concept of the re- 
liability coefficient, particularly its limitations. Kelley (558) showed that, 
under defined conditions, “upper and lower groups consisting of 27 per- 
cent from the extremes of the criterion score distribution are optimal 
for the study (of the validity) of test items.” Lev (567) used the method 
of analysis of variance to evaluate items and give them their proper 
weights. Guilford (539, 540) applied Fechner’s law to the scaling of 
test items, holding that the easiness of an item is proportional to the 
logarithm of the magnitude of the stimulus. Dunlap (520) found that 
two-response tests requiring underlining were more open to scoring errors 
than certain other response forms. 
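
Kelley's upper and lower groups can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions, not Kelley's own procedure: the function name, the 0/1 scoring of the item, the use of a simple difference in proportions as the index, and the sample data are all hypothetical. It ranks examinees on the criterion score, takes 27 percent from each extreme, and compares the proportion passing the item in the two groups.

```python
def upper_lower_discrimination(item_scores, criterion_scores, fraction=0.27):
    """Compare item success in the upper and lower criterion groups.

    item_scores      -- 1 if the examinee passed the item, else 0 (assumed scoring)
    criterion_scores -- total or criterion score for the same examinees
    fraction         -- share of examinees taken from each extreme (0.27 here)

    Returns (p_upper, p_lower, p_upper - p_lower).
    """
    if len(item_scores) != len(criterion_scores):
        raise ValueError("item and criterion lists must have equal length")
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must lie in (0, 0.5]")

    n = len(criterion_scores)
    k = max(1, round(n * fraction))  # size of each extreme group

    # Indices of examinees ordered from lowest to highest criterion score.
    order = sorted(range(n), key=lambda i: criterion_scores[i])
    lower, upper = order[:k], order[-k:]

    p_upper = sum(item_scores[i] for i in upper) / k
    p_lower = sum(item_scores[i] for i in lower) / k
    return p_upper, p_lower, p_upper - p_lower


# Hypothetical data for ten examinees.
criterion = [12, 35, 28, 40, 15, 22, 38, 18, 30, 25]
item = [0, 1, 1, 1, 0, 0, 1, 1, 1, 0]

p_upper, p_lower, difference = upper_lower_discrimination(item, criterion)
print(p_upper, p_lower, difference)
```

With ten examinees each extreme group holds three cases. An item that is passed far more often in the upper group than in the lower yields a large positive difference, while a difference near zero suggests the item does not separate the groups.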


Evaluation versus Measurement 


Although logically a part of the discussion of achievement testing, this 
concluding section of the present summary is set apart for the sake of 
emphasis. It is a greatly condensed treatment of a review involving 129 
titles. Space limitations have often necessitated the omission of authors’ 
names in the citations. 


The Philosophy and Function of Evaluation 


The change in terminology from “testing” and “measurement” to “ap- 
praisal” and “evaluation” was regarded by Hosic (552) as a significant 
development in education. Tyler (627, 628) held that the emphasis is not 
on the relative merits of tests, but on the extent to which evaluation instru- 
ments promote as well as measure important outcomes of instruction. 
According to this point of view the functions of evaluation are no different 
from those of the school as a whole, namely, to help provide more intelli- 
gent guidance of teaching and learning, to develop more effective curricu- 
lums and educative experience, to secure more intelligent and effective 
cooperation with parents and community, and to provide an adequate and 
objective basis for measuring, recording, and reporting progress that facili- 
tates the desired learning (576, 618). 

The hypotheses of evaluation—According to Wrightstone (646) the new 
point of view in evaluation is based on a number of hypotheses radically 
different from those of Thorndike. Curriculum change and evaluation are 
coordinate aspects of the educative process; a program of evaluation should 
be comprehensive; and present instruments are inadequate for many of 
the major objectives of education. Hence there is need for a variety of 
new means and technics for gathering evidence. The measures should 
correspond to the functional units of pupil behavior in actual curriculum 
situations; reliable and valid objective instruments of measurement are 
restricted to an appraisal of limited aspects of pupil behavior; and meas- 
ures of functional behavior can best be developed by teachers working in 
cooperation with test technicians. 


The Nature of Desired Achievement in the School Subjects 


In view of these newer objectives of education the school subjects are 
expected to show correspondingly new kinds of results. Art should de- 
velop initiative, interest, judgment, and cooperation (537); the physical 
sciences, the ability to use experimental methods in gathering, organizing, 
and interpreting scientific data and in applying scientific facts and prin- 
ciples (483, 530, 531, 532, 541, 627); English literature, the reading 
of literature understandingly, a broader understanding of life, greater sen- 
sitivity to social problems, and increasing intelligence with regard to 
human motives and purposes (557); French and Latin, sufficient com- 
mand of French and Latin vocabularies for simple reading and speaking 
(647, 650); home economics, better health, and a happier home life for 
all members of the family (503, 553, 586, 587); mathematics, thorough- 
ness and precision in thought and action, disposition to question the 
validity of assumptions, expressed or implied, and sensitivity to the logic 
of arguments (545, 546); nursing, proper attitudes toward patients, other 
nurses, and physicians, and a wide range of interest not only in nursing but 
also in other and related fields, as well as the proper habits and skills in 
the performance of nursing activities (510, 626); social studies, sensi- 
tivity to and disposition and ability to deal with social problems in an 
intelligent manner, interest in international affairs and human welfare, 
and attitudes favorable to social improvement (479, 576, 589, 648, 651, 
652); health and physical education, physical fitness, lively curiosity, 
self-confidence, and quickness and decisiveness of movement (508, 649). 


Constructing Newer Achievement Examinations 


The steps in constructing achievement examinations, according to the 
foregoing point of view, are likewise different from the well-established 
technics of objective test construction. They may be summarized as fol- 
lows: (a) specifying the objectives of the school program as a whole; 
(b) restating, if necessary, each of these objectives in the light of the 
nature, characteristics, and requirements of the course, field, unit, or area 
in the school program that is to be evaluated; (c) defining the types of 
behavior that normally show whether or not and to what extent the 
objectives are being realized; (d) selecting test situations that will evoke 
the types of student behavior patterns consistent with the objectives; and 
(e) trying out these test situations with a view to improving their validity 
and reliability and, at the same time, working toward making them more 
practicable (626, 627). 

Evaluation instruments—A number of new instruments have been con- 
structed both for appraising the school as a whole (542, 589, 644, 645, 646, 
651) and specific segments or areas of it. Of the latter type are those 
that bear the following titles: A Scale of Beliefs, Interpretation of Data, 
Familiarity with Sources of Data, Application of Principles of Thinking 
(several subject fields), Interest Index, Problems Relating to Proof in 
Mathematics, Literary Information Test (both English and American litera- 
ture), Questionnaire on Reading Interests and Reading Outcomes, Criti- 
cal Mindedness in the Reading of Fiction, Judging Effectiveness of Written 
Composition, Questionnaire on Voluntary Reading, Descriptive Test Pro- 
file, Evaluation of Reading, and a Checklist of Magazines (588). 

Developing a comprehensive program of evaluation—Tyler (576, 621) 
suggested ways and means of developing a program of evaluation that is 
both comprehensive and practicable “by making the appraisal an integral 
part of the learning process, by encouraging the pupil to make his own 
evaluation, by utilizing situations for evaluation which throw light upon 
the pupil’s development at those points where the collection of direct 
evidence is highly impracticable.” 

Needed research in evaluation—In order to develop a comprehensive 
program of evaluation, research is needed “in discovering types of be- 
havior which ought to be appraised, in devising means for appraising 
each important type of behavior, in refining appraisal instruments, in 
interpreting test results, and in follow-up studies regarding the permanence 
of learning” (625). Research in interest evaluation was also stressed 
by Weedon (633). 


Critical Evaluation of Evaluation 


Curiously enough, “evaluation” has been criticized on the same grounds 
as those on which it has criticized “measurement.” The following represent 
some of the negative comments on the work of Tyler, Wrightstone, and 
others. The technics may be valid, but “not adequate”; “failure to supply 
clearcut data regarding the practices evaluated seriously weakens the 
scientific validity of the study”; “the study is subject to the usual lack 
of reliability, validity, and adequate sampling”; and “statistical sig- 
nificance alone does not prove educational significance” (494:271-72). 


CHAPTER XII 
Rating Scales, Score-Cards, and Checklists¹ 


LEO J. BRUECKNER 


THE PURPOSE of rating scales, score cards, and checklists in educational 
research is to provide criteria useful in describing and evaluating some 
phase or element in the total learning situation. The following sections 
discuss a number of areas in which these instruments have been applied, 
including the appraisal of educational institutions and programs, descrip- 
tions of instructional practices, technics for evaluating curriculums and 
courses of study, methods of rating the staff and pupils, procedures for 
appraising materials of instruction, and methods of rating schoolbuildings. 
The materials reviewed are drawn largely from studies published within 
the last five years. References to summaries of earlier research are included 
when available. 


Means of Appraising Educational Institutions and Programs 


The past decade has witnessed the development of a series of checklists 
for describing and evaluating the elements of state programs of education; 
the characteristics of elementary schools, secondary schools, and colleges; 
and the provisions made by states for handicapped children. The first of 
these to appear was a comprehensive checklist for a self-survey of state 
school systems (677) published in 1930. It included checklists for studying 
the provisions made for the child and his welfare, the status of the teach- 
ing profession, state school finance, material equipment, and administra- 
tion and control. In 1932 Mort (673) published a scale for rating elemen- 
tary-school organization. Subsequently in 1937 Mort and Cornell (672) 
published a carefully constructed guide to be used by school systems to 
measure the extent to which local practices, related to curriculum, instruc- 
tion, and administration, were in accord with progressive practices found 
in the best school systems. The items in this checklist were selected on the 
basis that they clearly differentiated between practices of schools regarded 
as progressive and conservative. Provision was made in the checklist for 
giving evidence to support the evaluation made. 

The Cooperative Study of Secondary Schools in 1938 published a set 
of evaluative criteria in the form of comprehensive checklists for evalua- 
ting secondary schools (656, 657). These materials were the product of 
a group of specialists in secondary education. The blanks were tried out 
in a wide variety of schools and were found to be suggestive and stimu- 
lating. Their greatest value appears to grow out of their use for self- 
survey purposes. 


¹Bibliography for this chapter begins on page 617. 

Probably the most pretentious undertaking in the field of the develop- 
ment of checklists for evaluating institutions was the work of the Commis- 
sion on Higher Institutions of the North Central Association of Colleges 
and Secondary Schools (694). This commission undertook to study the 
validity of standards being used for determining admission to the Associa- 
tion, by securing quantitative information from a selected group of in- 
stitutions concerning many phases of their programs and then determining 
the relationship between these data and a ranking of the institutions on 
the basis of judged general merit. Profiles for each institution were pre- 
pared based on the items studied. The general conclusion was reached 
that there was little merit in the prevailing standards. It was also recom- 
mended that specific standards should not be set up as a basis for admission 
of institutions but that the Association encourage each institution to make 
a continuous study of its own program and that from time to time re- 
surveys be made of various standards to discover what the trends are. This 
investigation afforded an excellent illustration of the type of research that 
is needed to validate checklists and score cards of all kinds. Another 
example of this kind of study was the report by Crayton (658) of a series 
of standards in the form of checklists for evaluating the adequacy of the 
provisions made by a state for the education of various kinds of handi- 
capped children. Crayton’s checklist grew out of a survey and evaluation 
of best practices and legal provisions from all parts of the country. 
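
The kind of validation described above, relating a quantitative item to a ranking of institutions on judged general merit, can be sketched as a rank correlation. The Python below is an illustration only and is not taken from the Commission's report: the choice of Spearman's rank correlation, the function names, and the figures for six institutions are all assumptions made for the example.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: the product-moment correlation of the ranks."""
    if len(x) != len(y):
        raise ValueError("lists must have equal length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: one checklist item and a judged-merit score (higher is better).
library_volumes_thousands = [12, 45, 30, 8, 60, 22]
judged_merit = [2, 5, 3, 1, 6, 4]

print(round(spearman_rho(library_volumes_thousands, judged_merit), 3))
```

An item whose rank correlation with judged merit is near zero would, on this reasoning, contribute little to a standard, which is the sort of evidence a validation study of a checklist or score card would need to weigh.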


Descriptions of Instructional Practices 


The application of scientific procedures by supervisors has included the 
extensive use of rating scales and detailed checklists of all kinds to 
describe instructional practices. For example, Gray and Whipple (663) 
prepared a description of five levels of the teaching of reading which 
serve as a scale for rating the quality of the reading program. Feany (661), 
Brueckner (674:32-50), Otto (679), and others (653) reported the results 
of the extensive use of checklists to gather information about teaching prac- 
tices in such areas as social studies, arithmetic, and general education. 
Peik (680) published checklists to be used in the analysis and evaluation 
of recitations and units of work. In The Activity Movement (678) there 
was published a checklist for securing the judgments of persons concern- 
ing their points of view on a number of moot issues in education. Several 
comprehensive discussions of these kinds of procedures are available to 
which the reader is referred for further information (653, 676). 


Technics for Evaluating Curriculums and Courses of Study 


Several checklists have been prepared which provide excellent bases 
for evaluating curriculums and courses of study. The earliest of these by 
Stratemeyer and Bruner (685) has been widely applied. Bruner (655) 
recently published a revision of this checklist. Harap (665) made a sur- 
vey of new courses of study on the basis of another checklist of items. 
Leary (670) also published a similar checklist. The value of these materials 
lies in the detailed analysis they made of items that should be considered 
by any committee at work on the preparation of a curriculum or course 
of study. 


Methods of Rating Personnel 


Several excellent discussions of the use of rating scales for appraising 
personnel have been published to which the reader is referred for informa- 
tion about the history of the movement (654, 682, 691). Towner (686) 
reported the results of an analysis of items included in rating scales for 
elementary-school principals. Smith (684) made an analysis of rating 
sheets used for rating student teachers. The inadequacy of most of the 
available teacher-rating scales has been well demonstrated by Sandiford 
(682) and a study edited by Walker (688). In both of these studies it 
was demonstrated that there is a very low correlation between currently 
accepted measures of teaching ability and ratings of teachers. The need 
for extensive research in the field of teacher rating is very great. 

A number of significant studies have been made of the use of rating 
scales and checklists in appraising the characteristics of the learner. One 
of the most important of these is that of Van Alstyne (687) who developed 
a scale for rating behavior of children in classrooms, using dependable 
statistical procedures. Pistor (681) developed a checklist for appraising 
pupil behavior in progressive schools. Eckert and Marshall (659) re- 
ported the results of the application of an inventory in the form of a 
checklist of the characteristics of pupils at the time they left the schools 
of New York. This information was used by the Regents’ Inquiry as one 
of the bases of evaluating the program of secondary education in that state. 
Zyve (695) and Flory (662) made significant recommendations of the 
kinds of information that should be secured as a means of studying the 
changes that take place in the learner. 


Appraising Materials of Instruction 


Rating scales and score cards of many kinds are used in evaluating 
textbooks and other kinds of materials of instruction. Whipple (693) 
reported the results of a study of the kinds of items included in a large 
number of rating scales used in many school systems in selecting textbooks. 
Her summary of items should be very helpful in preparing more adequate 
checklists of this kind. Gray and Leary (664) studied the validity of a 
large number of items that might be considered in determining the read- 
ability of a book. A number of checklists for rating textbooks in English 
(683), arithmetic (689), reading (690), and other subjects have also 
appeared. The significance of these materials is the evidence that increas- 
ingly helpful efforts are being made to apply objective statistical technics in 
the selection of textbooks and other instructional materials. A series of check- 
lists for studying the selection and use of a wide variety of materials of in- 
struction was presented in the yearbook, Materials of Instruction (675). 
Wesley (692) published a checklist to be used as the basis of making a 
community survey. 


Rating School Buildings 


For many years score cards for rating school buildings have been 
used in school surveys. Three numbers of the Review of Educational Re- 
search have been devoted to the issues involved (666, 667, 668). The 
reader is referred to these numbers for detailed information. Two rela- 
tively new score cards include one by Engelhardt (660) for the elementary- 
school buildings and a general set of standards by Holy and Arnold (669). 
Special attention on the part of the reader is invited to the important 
study by Long (671) about the kinds of physical facilities teachers desire 
for carrying on activity programs. Long’s study suggested a procedure 
that may be used more widely to establish the validity of scales for rating 
buildings and equipment. It is unfortunately true that at present building 
score cards are largely the expressions of judgments of various individuals, 
unsupported by any competent evidence that the specifications which are 
set up are in fact adapted to the carrying on of effective educational 
programs. 


CHAPTER XIII 


Factor Analysis¹ 


KARL J. HOLZINGER and HARRY H. HARMAN 


Since FACTOR ANALYSIS is a relatively new subject and only a very brief 
review of its literature (705) has appeared in previous issues of the 
Review of Educational Research, it seems desirable to start at the beginning. 
Only the most significant articles can be considered in the allotted space. 
Many of the early papers which contributed to methodology, but whose 
proposals have since been replaced by simpler and more efficient pro- 
cedures, have been omitted here. Greatest emphasis has been placed on 
the research of the last four years, but even for this period about half of 
the published articles had to be omitted. 


Two-Factor Theory 


The field of statistics known as factor analysis was founded by Spear- 
man (746) in 1904 when he argued that “all branches of intellectual activity 
have in common one fundamental function (or group of functions), 
whereas the remaining or specific elements of the activity seem in every 
case to be wholly different from that in all the others.” This paper was 
the first in a series which led to the formulation of the famous Theory of 
Two Factors and which culminated with his Abilities of Man (745) in 
1927. The first major test in practice of Spearman’s theory was 
undertaken by Burt (699) in 1909 when he studied the intercorrelations for two 
groups of Oxford schoolboys on twelve tests. 

In an important study, Brown and Stephenson (698) verified the 
Theory of Two Factors on an adequate statistical basis. This research was 
partially in answer to a critical article by Pearson and Moul (741) in 
which they suggested that “some 12 to 15 abilities . . . the abilities being 
settled by psychologists a priori to avoid ‘overlaps,’ are essential to a 
satisfactory test, the observations to be made on a homogeneous population 
of several hundreds.” In another application of the Two-Factor Theory, 
Webb (768) found a general factor on the side of character, closely re- 
lated to “persistence of motives,” in addition to the general intellective 
factor. 

The statistical test of the adequacy of the Two-Factor Theory is furnished by the 
vanishing of all “tetrads” (745). Sampling errors for individual and 
average tetrads from a set of correlations were developed by Spearman 
and Holzinger (748). A number of critical papers relating to the validity 
of the tetrad criterion include those of Emmett (711), Heywood (717), 
Irwin (727, 728), Piaggio (742), and Wilson (769). 
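
The tetrad criterion can be illustrated with a brief computation. The sketch below is only an illustration under stated assumptions: the loadings are hypothetical, and the helper names (`tetrad_differences`, `max_abs_tetrad`, `correlations`) are invented here rather than taken from any of the papers cited. It builds one table of correlations from a single general factor and a second table in which two tests also share a group factor, and then reports the largest tetrad difference in each.

```python
from itertools import combinations


def tetrad_differences(r, a, b, c, d):
    """Return the three tetrad differences for tests a, b, c, d.

    r is a symmetric matrix (list of lists) of intercorrelations.
    """
    return (
        r[a][b] * r[c][d] - r[a][c] * r[b][d],
        r[a][b] * r[c][d] - r[a][d] * r[b][c],
        r[a][c] * r[b][d] - r[a][d] * r[b][c],
    )


def max_abs_tetrad(r):
    """Largest absolute tetrad difference over every set of four tests."""
    return max(
        abs(t)
        for quad in combinations(range(len(r)), 4)
        for t in tetrad_differences(r, *quad)
    )


def correlations(general, group=None):
    """Build correlations from general loadings plus optional group loadings.

    group maps a test index to its loading on one shared group factor.
    """
    group = group or {}
    n = len(general)
    return [
        [
            1.0 if i == j
            else general[i] * general[j] + group.get(i, 0.0) * group.get(j, 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical loadings of five tests on a single general factor.
g = [0.9, 0.8, 0.7, 0.6, 0.5]

one_factor = correlations(g)
# Tests 0 and 1 also share a hypothetical group factor.
with_group = correlations(g, group={0: 0.3, 1: 0.3})

print(max_abs_tetrad(one_factor))   # zero up to rounding error
print(max_abs_tetrad(with_group))   # clearly different from zero
```

With observed correlations the tetrads will not vanish exactly, which is why the sampling errors mentioned above are needed before the criterion can be applied.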


1 Bibliography for this chapter begins on page 619. 

In an important paper Garnett (713) in 1919 reviewed the literature 
on the Two-Factor Theory. He also presented the theory in a rigorous 
mathematical form including a geometric approach which has since been 
followed by such analysts as Thurstone (765). 


Bifactor Theories 


In more recent years, when psychologists started to use larger and more 
varied batteries of tests, they found that the tetrad criterion was generally 
not satisfied. Since this finding implies that a single general factor is not 
sufficient to account for the intercorrelations, a more elaborate theory was 
required. The bifactor theory as developed by Holzinger (719) postu- 
lated a general factor, a number of group factors identified with the 
mutually exclusive subsets of tests, and factors specific to each test. An 
elementary exposition of the bifactor method was presented in Student 
Manual of Factor Analysis (722). A number of applications of this method 
were reported under the sponsorship of the Unitary Traits Committee, 
of which E. L. Thorndike was chairman (719). A study indicating the 
stability of a bifactor solution was reported by Holzinger and Swine- 
ford (723). 

A set of standards for judging various factorial analyses was presented 
by Holzinger and Harman (720). In this paper they also showed the 
relationships between the bifactor solution and several of the multiple- 
factor types. These authors (718) made a practical comparison between 
the bifactor analysis and Thurstone’s earlier verbal description of an 
analysis which was later presented in his Primary Mental Abilities (764). 

A simple nonmathematical method of factor analysis which was pre- 
sented by Tryon (766) may be considered as essentially of the bifactor 
type. He grouped the tests into clusters and obtained final correlation 
profiles which reveal the essential nature of the underlying factors. 


Multiple-Factor Theories 


The earliest important contribution to the theory of multiple factor 
analysis was provided by Kelley in Cross-Roads in the Mind of Man (729). 
Using the tetrad criterion as a foundation, he developed more elaborate 
conditions for the existence of varying numbers of common and specific 
factors. The ensuing analyses involved complex overlapping of these 
factors. 

In a later development of multiple-factor analysis, Thurstone (762, 
765) provided a method involving two stages. First a preliminary solution 
is obtained and then this is rotated to a form which he regards as psycho- 
logically meaningful. The standards postulated for the final form of 
solution preclude the existence of a general factor. Thus the 
multiple-factor and bifactor solutions are essentially different in form, but like all 
factor solutions they may be converted from one to the other by suitable 
transformations (720). Since The Vectors of Mind (765), Thurstone has 
contributed a number of modifications to his method (761, 763). A 
simple variation of method is also furnished by Woodrow and Wilson 
(770). 

Two of the more important applications of multiple-factor analysis are 
those of Thurstone (763) and Mosier (740). Thurstone identified several 
“primary” mental abilities, while Mosier made an analysis of certain 
neurotic symptoms. Roff (743) and Dwyer (709) contributed papers deal- 
ing with the relation of multiple-factor analysis to aspects of classical 
statistical procedures. Horst (724) developed a method for describing a 
set of variables in terms of common factors such that all factors are de- 
termined simultaneously. 


Principal Component Theories 


Another variation of methods involving many common factors was sug- 
gested by Kelley and developed by Hotelling (725, 726) as the theory 
of principal components. These components are general factors, the first 
of which enters positively into all the variables, while the remaining ones 
have both positive and negative weights. Later Kelley (730) furnished a 
different statistical procedure but “the outcome is identical with that given 
by Hotelling’s method of analysis.” Lev (736), however, pointed out that 
the results of the two methods are in agreement only when the variances 
of all variables are the same. 
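
The sign pattern of the components described above can be checked numerically. The following is a minimal sketch, assuming a hypothetical table of four positive intercorrelations. Power iteration with deflation is used as a convenient stand-in, since the text does not describe the computing routine, and the function names are invented for the illustration.

```python
import math


def mat_vec(m, v):
    """Multiply a square matrix by a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def leading_component(m, iterations=500):
    """Approximate the largest root and its unit weight vector by power iteration."""
    n = len(m)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    root = sum(a * b for a, b in zip(v, mat_vec(m, v)))
    return root, v


def deflate(m, root, v):
    """Remove the part of the matrix accounted for by one component."""
    n = len(m)
    return [[m[i][j] - root * v[i] * v[j] for j in range(n)] for i in range(n)]


# Hypothetical intercorrelations of four tests, all positive.
corr = [
    [1.0, 0.6, 0.5, 0.4],
    [0.6, 1.0, 0.4, 0.3],
    [0.5, 0.4, 1.0, 0.2],
    [0.4, 0.3, 0.2, 1.0],
]

first_root, first_weights = leading_component(corr)
second_root, second_weights = leading_component(deflate(corr, first_root, first_weights))

print(first_weights)    # all weights carry the same sign
print(second_weights)   # weights of both signs
print(first_root / len(corr))   # share of total variance in the first component
```

Because the second set of weights must be orthogonal to a first set that is wholly positive, it necessarily contains weights of both signs.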

Several papers have appeared showing relationships between multiple- 
factor and component analyses. Girshick (714) re-examined the statistical 
bases of the theory of principal components and answered two major 
criticisms raised by Thurstone (765). Kellogg (732) showed that by 
suitable modification of Thurstone’s procedure the resulting technic 
would be identical with Hotelling’s. He also argued for the use of com- 
munalities in the diagonal elements of the correlation matrix with any 
method of analysis. 

Kelley and Krey (731) applied the method of principal components to 
the study of character traits in the field of social science. By a somewhat 
different type of analysis Burt (701) obtained several emotional factors 
of the component form. McCloy, Matheny, and Knott (738) obtained 
solutions of the multiple-factor and component form, and compared 
them. Thomson (757) showed that Hotelling’s method can be modified 
slightly to give Spearman’s solution for the case of a single general factor. 


General Statistical Contributions 


In contrast to the use of correlations between tests, which have ordinarily 
been employed as the basic data in factor analysis, Stephenson (749, 750) 
argued for the use of the correlations between persons. For example, in 
his study of typology direct factors among persons instead of the 
personality traits were obtained (751). 

Thomson (756, 759) proposed a Sampling Theory of ability in which 
he regarded a number of factors as being a sample of all those employed 
by an individual in carrying out various mental tasks. Dodd (707) made 
a critical analysis of the Sampling Theory and compared it with Spear- 
man’s Two-Factor Theory. Mackie (739) furnished the probable value 
of the tetrad difference on the Sampling Theory. 

In factor analysis, linear descriptions of the tests in terms of the factors 
are ordinarily obtained. To determine an individual’s factors, then, an 
estimate of these in terms of the tests is necessary. Methods for estimation by 
means of regression equations have been presented by Thomson (760), 
Harman (716), and Ledermann (733). Bartlett (696) proposed an alter- 
nate method based upon the minimizing of specific factors. Papers by 
Dwyer (710) and Harman (715) were written on the subject of intro- 
ducing solutions for additional tests after the factorization had been com- 
pleted. 

A theoretical presentation of the criteria for determining the rank of 
the reduced correlational matrix was given by Ledermann (734). Another 
paper on this topic was presented by Young (771), who obtained an index 
of clustering to determine the number of common factors. Burt (703) 
has pointed out that “with a sufficient number of self-multiplications, 
any . . . table of correlation or covariances can be reduced as closely as 
we wish to a matrix of rank one, i. e., to a Spearman hierarchy.” 
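
Burt's remark on self-multiplication can be tried on a small table. The sketch below assumes a hypothetical correlation matrix and invented helper names. It rescales by the trace after each squaring purely as a computing convenience, which does not alter the rank, and it takes the largest 2 x 2 minor as the measure of departure from rank one.

```python
def mat_mul(a, b):
    """Multiply two square matrices of the same order."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def rescale(m):
    """Divide by the trace so repeated squaring neither overflows nor vanishes."""
    trace = sum(m[i][i] for i in range(len(m)))
    return [[x / trace for x in row] for row in m]


def max_abs_minor(m):
    """Largest absolute 2 x 2 minor; every such minor is zero for a rank-one matrix."""
    n = len(m)
    return max(
        abs(m[i][j] * m[k][l] - m[i][l] * m[k][j])
        for i in range(n) for k in range(i + 1, n)
        for j in range(n) for l in range(j + 1, n)
    )


def self_multiply(m, times):
    """Square the matrix repeatedly, rescaling after each squaring."""
    m = rescale(m)
    for _ in range(times):
        m = rescale(mat_mul(m, m))
    return m


# Hypothetical intercorrelations of four tests.
corr = [
    [1.0, 0.6, 0.5, 0.4],
    [0.6, 1.0, 0.4, 0.3],
    [0.5, 0.4, 1.0, 0.2],
    [0.4, 0.3, 0.2, 1.0],
]

before = max_abs_minor(rescale(corr))
after = max_abs_minor(self_multiply(corr, 8))
print(before, after)   # the minors shrink toward zero as the squarings proceed
```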

Another topic, which is chiefly of theoretical importance, is concerned 
with the conditions determining the minimum number of tests in which 
a factor must be present. Such boundary conditions have been developed 
by Thomson (752, 753), and the probable errors of some of them have 
been obtained by Black (697). Ledermann (735) has given rigorous proofs 
of certain theorems involved in the boundary conditions which have been 
conveniently formulated by Thomson (754). 

In a cleverly written, stimulating paper Cureton (706) calls attention 
to some of the underlying assumptions made by the various approaches to 
factor analysis. 








CHAPTER XIV 
Index Numbers and Related Composites¹ 


DOUGLAS E. SCATES 


Why Index Numbers? 


EDUCATORS ARE CONCERNED with index numbers because (a) they are 
useful in reflecting trends in the financial aspects of public education; 
(b) they are useful in evaluating educational proficiency; and (c) they 
are available in abundance for measuring economic factors which affect 
the support of education, and which, when properly interpreted, should 
have some significance for the curriculum. 

An index number is a statistical technic for representing change in a 
variable when this change may be regarded as (or may be reflected in) the 
sum of changes in a number of more elemental variables. The elemental 
variables (referred to as criteria in evaluative index numbers) may be 
independent or correlated; may be homogeneous or heterogeneous 
(individually and as a group); may have weights which are constant or variable 
for each element; and are usually but a sampling of the total number of 
elementary variables known to compose or affect the general trait. Index 
numbers had their origin in economics with their purpose to reflect average 
changes in price, and they are still popularly thought of in that connection. 
They have been used in educational research to show changes in composite 
costs; to indicate weighted changes in quantities of cases; and to represent 
variations in quality. They may be employed to represent a composite 
change from time to time, from place to place, or from one set of conditions 
to another, among any group of observed variables. Their greatest service 
is in the measurement of a general character which is not susceptible of 
direct measurement, though they are not limited to this purpose. 

The technic is flexible, having the advantages over a multiple regression 
technic of not requiring a criterion (dependent variable), of accommodat- 
ing a large number of variables without undue labor, of permitting variable 
weights for each factor, and of requiring a theoretical minimum of only 
one (or two) cases, in the ordinary sense. On the other hand, these points 
are not always clear advantages. The fact that there is no criterion variable 
employed for index numbers necessitates that the weights and factors be 
selected with varying degrees of arbitrariness. It is possible that factor 
analysis may be available to reduce this arbitrariness in certain fields of 
application. 


1 Bibliography for this chapter begins on page 622. 


What Constitutes an Index Number? 


The typical index number has certain rather definite characteristics: it 
is (a) a summation or average (b) of many elements, (c) these being a 
sample of a larger universe of elements, (d) each of the elements receiving 
a weight (e) which varies from observation to observation, (f) the values 
at any observation being expressed as a percent of the value for some 
observation point selected as the base, (g) with various types of formula 
rectification for bias. In practice we find variations which depart from this 
typical pattern by all possible degrees; each one of the characteristics 
individually is lacking in some one or another type of value which is called 
an “index” or “index number,” and usually several of the characters are 
lacking. The most typical structure, as above described, degenerates com- 
monly in one of four directions: (a) the value represents only a single 
trait, instead of a composite of traits, expressed as a ratio to a constant base; 
(b) the value represents a composite, but is not expressed as a ratio; 
(c) constant rather than variable weights are used; (d) the various traits 
or elements are not combined but are left separate. We shall recognize 
here types (a), (b), and (c) for the purpose of the present discussion; 
type (d) represents only the raw material for an index. While economic 
statisticians customarily take for granted the relative or ratio aspect of 
index numbers, educational and psychological statisticians are primarily 
interested in the fact that they are composites, and so are concerned with 
type (b); the detail of expressing these as ratios does not seem to be 
important. 
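
The simplest of these structures, a weighted composite expressed as a percent of its value at a base observation, may be sketched as follows. The figures, the weights, and the function names are hypothetical, and constant weights are used, so the example corresponds to the reduced form (c) rather than to the fully typical index.

```python
def composite(values, weights):
    """Weighted sum of the elemental variables at one observation."""
    return sum(v * w for v, w in zip(values, weights))


def index_series(observations, weights, base):
    """Express each composite as a percent of the composite at the base observation."""
    base_value = composite(observations[base], weights)
    return {
        label: 100.0 * composite(values, weights) / base_value
        for label, values in observations.items()
    }


# Hypothetical unit costs of three elements at two dates.
observations = {1930: [10.0, 20.0, 30.0], 1935: [12.0, 22.0, 30.0]}
weights = [1.0, 2.0, 3.0]   # hypothetical constant weights

series = index_series(observations, weights, base=1930)
print(series)   # the base observation stands at 100
```

Leaving out the division by the base composite gives the type (b) value, a composite that is not expressed as a ratio.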

As for definitions, an extensive set can be found in the Kurtz-Edgerton 
Statistical Dictionary (818) which was just published. This volume defined 
nearly 100 index number terms, answering a multitude of questions which 
arise during reading and conversing about index numbers. Almost anyone 
can review with profit the definitions of even basic concepts, such as time 
reversal test, factor reversal test, circular test, link relative index number, 
chain index number, aggregative index number, splicing, and rectification. 
The less well-known terms also afford interest. Other definitions and descrip- 
tions will be found scattered through the literature; see especially (777). 
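
Two of the tests named above lend themselves to a short numerical check. The sketch below assumes hypothetical prices and quantities for two commodities and uses the standard Laspeyres, Paasche, and Fisher "ideal" forms as they are commonly stated, not as quoted from the dictionary. It compares the Laspeyres and the "ideal" forms on the time reversal and factor reversal tests.

```python
import math


def laspeyres(p0, q0, p1, q1):
    """Price index with base-period quantities as weights."""
    return sum(p * q for p, q in zip(p1, q0)) / sum(p * q for p, q in zip(p0, q0))


def paasche(p0, q0, p1, q1):
    """Price index with given-period quantities as weights."""
    return sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q1))


def fisher_ideal(p0, q0, p1, q1):
    """Geometric mean of the Laspeyres and Paasche forms."""
    return math.sqrt(laspeyres(p0, q0, p1, q1) * paasche(p0, q0, p1, q1))


def time_reversal_product(formula, p0, q0, p1, q1):
    """Forward index times backward index; equals 1 when the test is met."""
    return formula(p0, q0, p1, q1) * formula(p1, q1, p0, q0)


def factor_reversal_gap(formula, p0, q0, p1, q1):
    """Price index times quantity index, less the ratio of total values."""
    value_ratio = sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q0))
    price_index = formula(p0, q0, p1, q1)
    quantity_index = formula(q0, p0, q1, p1)   # prices and quantities exchange roles
    return price_index * quantity_index - value_ratio


# Hypothetical prices and quantities of two commodities at two dates.
p0, q0 = [1.0, 2.0], [10.0, 5.0]
p1, q1 = [1.5, 2.2], [8.0, 6.0]

print(time_reversal_product(laspeyres, p0, q0, p1, q1))      # departs from 1
print(time_reversal_product(fisher_ideal, p0, q0, p1, q1))   # 1 up to rounding
print(factor_reversal_gap(laspeyres, p0, q0, p1, q1))        # departs from 0
print(factor_reversal_gap(fisher_ideal, p0, q0, p1, q1))     # 0 up to rounding
```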


Previous General Treatises 


Fisher’s book (800) is still the principal work, although many new points 
are being emphasized in current literature. His book included two bibliog- 
raphies, of 20 and 39 references (p. 519-23), which open up the earlier 
literature. Good, Barr, and Scates (808:440-54) gave the most comprehen- 
sive discussion in educational sources, treating the common economic in- 
dexes, such as wholesale prices, retail prices, cost of living, and various 
business indicators; index numbers for school costs—supplies, building 
material, bond rates, teachers’ salaries, teachers’ costs of living, and general 
increases in school expenditures, and also the evaluative index numbers 
for rating school systems. Monroe and Engelhart (828) referred to about 
a dozen index numbers in education. Scates (852) discussed the general 
nature and applicability of index numbers for educational problems, citing 
examples. These four treatises cover the pertinent literature before 1936 
and should be consulted for basic material and outstanding examples of 
index number series. With but few exceptions, the works cited in these 
treatises will not be repeated here. 


Ability, Effort, and Need in the Support of Education 


Studies of major import have been directed toward the measurement of 
the ability to support schools, the effort that is being put forth to support 
them, and the residual need in the support of public education, by states 
and other political units. In most cases these studies utilize some form 
of index number. Mort has been active in producing, directing, and stim- 
ulating studies in this area; we should mention his own studies on state 
support, of 1924, 1926, and 1933 (832, 834, 835), and on federal support, 
of 1934 and 1936 (831, 833), in addition to several studies of individual 
states. Some of this work rests on a regression equation; other parts 
embody characteristic index number technics. The National Education As- 
sociation’s Research Division in 1926 (837), and in 1937 Norton and 
Norton (845), reported indexes of financial ability, financial provisions, 
and adequacy of educational program for the forty-eight states. The 1926 
report is regarded as “the pioneer study in the area of the relative ability 
of the states to support education.” Newcomer (843) in 1935 and Chism 
(781) in 1936 prepared indexes of the ability of individual states to 
finance education. 

The Research Division of the National Education Association utilized 
the findings of Newcomer and Chism, in addition to other data, in a study 
of the needs and the effort being put forth by the different states to meet 
their needs (838). Chapter II of that report reviewed previous research on 
indexing the efforts of local communities and states to support education, 
as well as research on the adequacy of educational programs in com- 
munities and states. A bibliography was given at the end of the bulletin 
and in footnotes. Studies of ability of local districts were made by Cornell 
(785) and by Overn and Knapp (846). Most of these studies have been 
covered in some detail in the Review for April 1938. 

While a clearcut form of index number is not in evidence in some of 
these studies, the function of the raw data employed is a weighted 
summation of elementary variables, frequently referred to a temporal or 
geographical base, and all the studies could be cast into the form of an index 
number formula. They are therefore important examples of the type of 
measurement for which this technic is adapted. 


Representing and Analyzing Increasing Costs 


In ascertaining the factors which have given rise to increasing school 
costs from 1914 to 1930 the Research Division of the National Education 
Association (842) prepared detailed index numbers for eleven different 
classes of school expenditures and then combined these to form a general 
index number of public school costs. This index number increased from 
100 to 171 in the sixteen-year period; the reciprocal of this series was 
used to give the purchasing power of the school dollar, which declined 
from 100 to 59. 
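
The reciprocal relation between the cost index and the purchasing power of the dollar may be written out in a line or two. The sketch below is illustrative only, and the function name is invented. Applied to the rounded end value of 171, it gives about 58.5, close to the reported 59; the small gap may reflect rounding in the published series, which the source does not explain.

```python
def purchasing_power(cost_index, base=100.0):
    """Reciprocal of a cost index, expressed on the same base of 100."""
    return base * base / cost_index


print(round(purchasing_power(100.0), 1))   # base year
print(round(purchasing_power(171.0), 1))   # end of the period, from the rounded index
```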

As a second factor in the increasing cost the Research Division estimated 
the increase in educational load. Account was taken of the increasing 
proportion of high-school pupils by converting them into equivalent 
elementary-school pupils on the basis of relative cost. This procedure is the 
logical equivalent of employing a quantity index number. Scates and 
Baetz (854) calculated a quantity index number for the public schools of 
Cincinnati and derived from it a unit cost trend unaffected by the changing 
composition of the school population. This use of quantity index numbers 
represents a technic for the elimination of undesired factors. So far it has 
not been exploited. 
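
The conversion to equivalent pupils, and the unit cost trend derived from it, can be sketched as follows. All enrolments, expenditures, and the relative cost ratio of 2.0 are hypothetical; none of them comes from the studies cited, and the function names are invented for the illustration.

```python
def equivalent_pupils(elementary, high_school, relative_cost):
    """Educational load in equivalent elementary-school pupils."""
    return elementary + relative_cost * high_school


def percent_of_base(value, base_value):
    """Express a value as a percent of the base-year value."""
    return 100.0 * value / base_value


# Hypothetical data: enrolments and total expenditures at two dates.
relative_cost = 2.0   # assumed cost of a high-school pupil relative to an elementary pupil
years = {
    1914: {"elementary": 1000.0, "high_school": 100.0, "expenditure": 60000.0},
    1930: {"elementary": 1200.0, "high_school": 300.0, "expenditure": 99000.0},
}

load = {
    year: equivalent_pupils(d["elementary"], d["high_school"], relative_cost)
    for year, d in years.items()
}
heads = {year: d["elementary"] + d["high_school"] for year, d in years.items()}

quantity_index = percent_of_base(load[1930], load[1914])
cost_per_equivalent = percent_of_base(
    years[1930]["expenditure"] / load[1930], years[1914]["expenditure"] / load[1914]
)
cost_per_head = percent_of_base(
    years[1930]["expenditure"] / heads[1930], years[1914]["expenditure"] / heads[1914]
)

print(quantity_index)        # growth in load, weighted by relative cost
print(cost_per_equivalent)   # unit cost trend free of the change in composition
print(cost_per_head)         # unit cost trend that still reflects the shift to high school
```

In these made-up figures the cost per pupil enrolled rises more than the cost per equivalent pupil, the difference being the part of the increase attributable to the changed composition of the school population.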

A number of cost-of-living indexes for teachers were prepared before 
1936 and are reviewed in the sources mentioned at the outset (808:449). 
Most of these were summarized in a National Education Association 
Research Bulletin (841). One of them was examined experimentally in a 
study of the effect of weights (851). The series of index numbers by Clark 
and others on school bonds, school supplies, and schoolhouse construction 
were also reviewed previously (808). 


Index Numbers for Rating State School Systems 


Another large purpose for which index numbers are used in education is 
the evaluation of schools and school systems. Perhaps the most pretentious 
of these indexes have been the ratings of state school systems, which began 
with Ayres’ work in 1912. The earlier ones have been covered in sources 
previously mentioned (808:441-42). A National Education Association 
study (839) should be cited again because of its excellent review of earlier 
works; a later bulletin (838) mentioned a number of the studies also. 
The National Education Association (839) advanced five criteria for rating 
states, but did not combine them; Scates and Fauntleroy (851) tried sev- 
eral sets of weights on these series, resulting in an index number series 
for the states based on equal weights and one based on natural weights, 
in addition to other series of less immediate interest. These data represented 
approximately the year 1930. Scates also recalculated the index number 
of Schrammel and Sonnenberg because it was based on provisional data 
and contained errors; this revision (853) brought the index number data 
for the states up to 1934. Furney (805, 806) used basic data for 1936, 
offering two separate index numbers—the first consisting of five non- 
financial factors and the second consisting of five financial factors. He 
found a correlation of .82 between his two index number series. (Ayres, 
1920, reported a correlation of .78 between his financial and nonfinancial 
factors.) Furney analyzed his index numbers according to the density of 
population; Phillips also did this with his 1930 index. Jobe (813), using 
five criteria, computed index numbers for ranking the states for each of 
the years 1926, 1928, 1930, 1932, 1934, and 1936. He made special com- 
parisons among sixteen southern states. For 1936 he included two other 
index numbers, one based on actual (raw) data and one on six criteria. 

The correlation of .8 between financial and nonfinancial criteria is 
significant in the light of criticism sometimes brought against the index 
numbers, namely, that they are too heavily weighted with financial items 
which do not in themselves represent merit. The correlation found suggests 
that the financial items are not being given undue weight; for the most 
part they agree with the nonfinancial items, and for the extent to which 
they differ from the nonfinancial items there is more reason for assuming 
that the major portion of this difference is in the direction of merit (repre- 
senting phases of merit not reflected in the nonfinancial factors) than there 
is for assuming that the major portion will be in the opposite direction. 


The Rating of Selected Groups of Schools 


The Cooperative Study of Secondary School Standards, sponsored by 
six regional accrediting associations of colleges and secondary schools, 
developed a series of rating procedures. The total score on the nine sum- 
mary evaluative criteria and the grand total score on the seven different 
kinds of measurement both carried the structure of an evaluative index 
number (784, 797). The Committee gave large attention to the selection 
of its criteria and has recognized the importance of the effect of these 
criteria upon the growth efforts of the schools, as well as their service 
in measurement. By comparison, the criteria employed in the studies which 
have produced index numbers of state school systems appeared super- 
ficial and mechanical; it should be said however that the nationwide 
evaluations included elementary schools as well as high schools, and they 
had to depend upon official figures which were available for all the 
states. The accrediting associations were concerned only with schools 
coming under their jurisdiction, from which they secured the desired 
facts. The work of this Committee may be traced through references given 
in the Education Index under the name of the committee. 

The North Central Association of Secondary Schools and Colleges has 
done a similar piece of work for the purpose of determining accreditation 
of colleges and universities. The study was published in seven volumes which 
are reviewed briefly by Sears (857). The report evidenced unusual care 
and extensive study in the selection of the criteria. The percentile ratings 
which were given on eighty-one separate items, grouped into eleven major 
divisions, could be added together, the total having the structure of an 
equally weighted index number. It was evident, however, that the 
Committee was not primarily concerned with a total score but was interested 
in the profile of ratings and desired to judge the entire pattern as a 
pattern and not as a mechanical sum of its elements. Further, each profile 
pattern was to be judged in the light of the declared objectives of each in- 
stitution. To those who are interested primarily in quantitative technics 
this conclusion offered a challenge. Granting that the profile will always 
serve certain ends which a total cannot, we may raise a question as to 
whether the final determination of acceptability could not be accomplished 
through the application of the index number technic without doing violence 
to the purposes of the Committee. Such an index number would have to 
utilize selective groupings of traits and variable weights (perhaps weights 
which varied with different ratings on a given trait)—the groupings and 
the weights to be ascertained from an analysis of the discussions and con- 
clusions of judges when at work determining accreditation, so that the 
values and the degrees of compensation could be determined. This problem 
has not yet been worked out in the field of rating, but its delicacy and com- 
plexity are not beyond the possibility of methodological treatment. The 
underlying problem is essentially the same as that of the cost-of-living 
index. In our evaluative index numbers, and in rating scales, we have re- 
mained too contentedly in the primitive stage of constant weights. 

Foster (801) produced a composite rating of graduate schools based on 
twenty-eight criteria. Eells (796) objected to the natural weighting used 
by Foster and recalculated the index, giving equal weight to each of the 
twenty-eight traits. The change of weighting changed the position of sev- 
eral institutions. An index number consisting of eight elements for rating 
both private and public higher institutions in states was prepared by 
Chamberlain and Meece (780). Hoffman (811) proposed one for sec- 
ondary schools. 
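
How a change of weighting can alter the order of institutions may be seen in a small made-up case. The sketch below assumes that "natural" weighting means adding raw scores, so that a trait with a wider spread counts for more, and that "equal" weighting means adding scores after each trait has been reduced to standard units. These readings are assumptions made for the illustration and are not taken from Foster or Eells, and the three institutions and their scores are hypothetical.

```python
from statistics import mean, pstdev


def natural_composite(table):
    """Sum of raw scores; traits with wider spread weigh more heavily."""
    return {name: sum(scores) for name, scores in table.items()}


def equal_weight_composite(table):
    """Sum of standard scores; each trait contributes equal spread."""
    names = list(table)
    n_traits = len(table[names[0]])
    totals = {name: 0.0 for name in names}
    for t in range(n_traits):
        column = [table[name][t] for name in names]
        centre, spread = mean(column), pstdev(column)
        for name in names:
            totals[name] += (table[name][t] - centre) / spread
    return totals


def ranking(totals):
    """Names ordered from highest composite to lowest."""
    return sorted(totals, key=totals.get, reverse=True)


# Hypothetical scores of three institutions on two traits of unequal spread.
table = {"X": [90.0, 4.0], "Y": [70.0, 9.0], "Z": [50.0, 8.0]}

natural_order = ranking(natural_composite(table))
equal_order = ranking(equal_weight_composite(table))
print(natural_order)
print(equal_order)
```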


Other Educational Applications 


Dwyer (794) sought a measure of college teaching load which would 
combine various aspects of load and would be based on objective data. 
He found that three elements, namely, the number of different credit hours 
taught, the number of teaching hours, and the total enrolment, equally 
weighted, gave nearly the same results as an index number composed of 
sixteen elements. The figures shown in the report were, however, averages 
for departments, or other sizable groups; the finding might not hold for 
individual instructors. Dwyer’s conclusions agreed closely with the recom- 
mendations of Reeves and Russell (849) of ten years ago. The formula 
of Douglass (793) on teaching load in secondary schools becomes an index 
number when the traits are expressed in appropriate units and properly 
subdivided. Boardman (776) reported on the use of an expanded form of 
Douglass’ formula. Overn (847) calculated a “single-factor” type of index 
for teacher demand for the state of Minnesota. The National Education 
Association (840) calculated a group-to-group index number of 
educational salaries for 1938-39. 


United States Bureau of Labor Statistics Indexes 


Several index number series of the Bureau of Labor Statistics—which 
are probably the most widely used basic index numbers in the country— 
have undergone revision during the past four years. Weights for the 
cost-of-living index were heretofore based on “the spending of families of 
wage-earners and lower-salaried workers, as shown by the Bureau’s study of 
the expenditures of 12,096 families in 1917-19” (see the Journal of the 
American Statistical Association 31:610 for references to this study and 
the analyses of Osburn). The field work of a new nationwide study of 
expenditures was completed in 1936 as a basis for revising these weights so 
that they would “more nearly approximate presentday consumption. . . . 
Pending this basic revision in weights, several important revisions in 
method have been incorporated in the indexes beginning with the March 
15, 1935 period, and the . . . indexes have been revised back to the base 
years” (863:2). The changes were described in an earlier article (868); 
for preceding history see references cited in (808:445). The new weights 
derived from the 1934-36 study of expenditure patterns of groups in fifty- 
five cities were used for the first time in the June 1939 cost-of-living index. 
The data of this extensive study are now beginning to be published (see 
Journal of the American Statistical Association 34:378 for an outline of 
plans). A general discussion of the study, with data from various cities, 
appeared in the Monthly Labor Review from time to time (830). An even 
more extensive study of expenditures for families on all income levels in 
thirty-two cities was undertaken by the Bureau of Labor Statistics and the 
Bureau of Home Economics in cooperation with the National Resources 
Committee and the Central Statistical Board (779, 815, 827, 855, 867). 
The base period of certain of the index series also may be changed (850). 

The Bureau of Labor Statistics wholesale price index underwent a change 
in computational procedure in 1937 (790) as part of a general reconsidera- 
tion of the series. Begun in 1890, the index number was calculated from 
1908 through 1936 as a chain index; since January 1937 it has been com- 
puted on a fixed base. Reasons for the change in the base, as well as ref- 
erences to earlier indexes, were given in the reference cited. The formula 
remains that of Laspeyres (Fisher’s No. 53) with weights revised from 
time to time. 
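
The difference between a chain index and one computed on a fixed base may be shown with invented figures. The sketch below applies the Laspeyres form both ways to hypothetical prices and quantities for two commodities over three periods. It is not a reproduction of the Bureau's procedure, whose commodity lists and weight revisions are not given here, and the function names are invented.

```python
def aggregate(prices, quantities):
    """Total value of a set of quantities at a set of prices."""
    return sum(p * q for p, q in zip(prices, quantities))


def fixed_base_series(prices, quantities):
    """Laspeyres index of each period on the first period, with first-period weights."""
    q0 = quantities[0]
    base = aggregate(prices[0], q0)
    return [100.0 * aggregate(p, q0) / base for p in prices]


def chain_series(prices, quantities):
    """Product of period-to-period Laspeyres links, each weighted by the prior period."""
    series = [100.0]
    for t in range(1, len(prices)):
        link = aggregate(prices[t], quantities[t - 1]) / aggregate(prices[t - 1], quantities[t - 1])
        series.append(series[-1] * link)
    return series


# Hypothetical prices and quantities of two commodities in three periods.
prices = [[1.0, 2.0], [1.5, 2.2], [1.2, 3.0]]
quantities = [[10.0, 5.0], [8.0, 6.0], [9.0, 4.0]]

fixed = fixed_base_series(prices, quantities)
chained = chain_series(prices, quantities)
# With unchanging quantities the two methods coincide.
chained_constant = chain_series(prices, [quantities[0]] * 3)

print(fixed)
print(chained)
print(chained_constant)
```

The two series part company only because the quantity weights shift from period to period; when the weights are held constant the chain reproduces the fixed-base figures.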

The retail price of foods index has been receiving its share of attention. 
The current methods of collecting data, and the forming of a Conference on 
Price Research in November 1935 to coordinate the interests and insights 
of various groups, were described in the Monthly Labor Review for January 
1936 (p. 253-54) and March 1936 (p. 836-37). To follow the work of the 
Bureau on these and other indexes one should consult files of the maga- 
zine just mentioned, and the notes of the Bureau of Labor Statistics in 
current issues of the Journal of the American Statistical Association. 

Price Levels, Cost of Living, and Purchasing Power


The changes in the cost-of-living index referred to in the preceding sec- 
tion are a part of a general ferment in this area of index numbers. Surging 
with the zest of mathematical exploration and development of an under- 
lying principle, activity on the theoretical side of cost-of-living indexes has 
been notable during the past few years. The discussion grows out of the 
influence of a psychological principle in consumption, leading to the fact 
that a cost-of-living index is not so much concerned with reflecting changes 
in the cost of a fixed set of items as it is in reflecting changes in the cost 
of a given standard of living—a general level of satisfaction. As prices 
change for a given income, or as income changes out of proportion to 
prices, individuals will seek to obtain, or to maintain, as large an amount 
of satisfaction as possible under the new conditions—which means that 
the pattern of expenditures (the ratios between the quantities of different 
items which are purchased) will change. The individual may in fact buy 
many entirely different things. This condition makes cost-of-living index 
numbers which are based on exactly the same items from time to time 
somewhat academic and artificial. 

In order to obtain a cost-of-living index number (or various series of 
indexes) for different income groups, and in order to modify properly the 
quantity weights in these indexes when prices change, it is necessary to 
know or estimate how the patterns vary under the influence of three prin- 
cipal factors—prices, income, and size and composition of family. Or, 
from another angle, it is necessary to know what the conditions are which 
determine when two different expenditure patterns represent the same 
standards of living, or of satisfaction. 

A large number of field surveys, in addition to the two large undertak- 
ings mentioned in the preceding section, have been made to study family 
expenditures (772, 869, 870). It is the mathematical analysis of these data 
to determine the interrelationships mentioned that has produced the arti- 
cles above referred to. The analysis involves concepts of Engel curves, 
Konüs inequalities, indifference curves and surfaces, indicators, and the
modern mathematical theory of utility, exchange, and demand. The ap- 
proaches and interpretations are not all in agreement, and whether a con- 
clusive solution has been reached remains to be seen. The discussion has 
appeared principally in Econometrica and in the annual proceedings of the 
Cowles Commission; one may consult these seriatim, or trace the principal 
arguments in the following references by Frisch (802, 804), Bowley (778), 
Mendershausen (824), Wald (864), Schultz (856), Konüs (817), Allen
and Bowley (772); also in papers in the Report of the third conference of 
the Cowles Commission (788) by Frisch; and in the Report of the fourth 
conference (787), papers by Petersen, Wald, and Allen. 

Ferger (799) contended that an index number for measuring purchasing
power should be a harmonic average of costs; his paper is discussed
by Lewis (820).
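
One way to see how a harmonic average enters (a sketch of the arithmetic only, not necessarily Ferger's own argument): the purchasing power of money over an item is the reciprocal of its price relative, so the arithmetic mean of purchasing-power relatives equals the reciprocal of the harmonic mean of the price relatives. The price relatives below are hypothetical.

```python
from statistics import harmonic_mean, mean

# Hypothetical price relatives (current price / base price) for three items.
price_relatives = [1.25, 2.0, 0.8]

# Purchasing power over each item is the reciprocal of its price relative.
purchasing_power = [1 / r for r in price_relatives]

avg_power = mean(purchasing_power)
via_harmonic = 1 / harmonic_mean(price_relatives)
via_arithmetic = 1 / mean(price_relatives)

print(avg_power, via_harmonic, via_arithmetic)
```

The reciprocal of the arithmetic mean of the price relatives gives a different and smaller figure, which is the practical point of the distinction.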

General Economic and Business Indexes


Lists and indexes of series—Davenport and Scott (791) in 1937 pre- 
pared an index to more than 200 business series, with a general description 
of each and the name of the publication in which it may be found. The 
National Association of Purchasing Agents (836) in 1937 issued a list 
including a brief discussion of thirteen leading commodity index number 
series. Data back to 1913 were given and graphed for each index series. 
Black and Mudgett (775) listed a number of indexes related to agriculture. 
Croxton and Cowden (789) referred to a number of series; Fisher (800: 
432-38) listed 85 discontinued series, 99 current series in foreign countries,
and 34 current series in the United States. Citations will be found in these 
sources to other sources and to earlier lists. See also the following para- 
graph. 

Important new series—Cowles and associates (786) prepared a notable 
set of stock indexes, running back to 1871. Cole (783) supervised work on 
wholesale prices running back to the beginning of the eighteenth century 
—as part of the work of the International Scientific Committee on Price 
History. Hickernell (810) also produced some series of historical value— 
1815-60. Johnson (814) reported a new index on physical volume of busi- 
ness to replace the former indexes of volume of trade. Many other series 
of more or less recent origin will be found in the lists cited in the preced- 
ing paragraph. 

Discussions of uses, values, and shortcomings—Without attempting to 
list all the recent articles on economic index numbers we may cite a num- 
ber of them which throw light on practical applications and interpretations 
of these series. These discussions are of value to those educators who are 
concerned with the economic background of education and educational 
support—and also to those who wish light on the interpretation of index 
numbers, whether economic or not (782, 792, 819, 823, 825, 861, 865,
866). Other articles, some more theoretical, will be found in issues of the 
Journal of the American Statistical Association. 


The Construction of Index Numbers 


Methods used in preparing current series—Descriptions of the method 
of preparing a number of series of indexes prior to 1936 were cited (808: 
footnotes p. 226-29, and p. 449-53). Actual and contemplated changes in the 
preparation of the Bureau of Labor Statistics indexes have been presented 
in earlier sections. A full description of the methods used by the National 
Industrial Conference Board in its cost-of-living index was given by Beney
(774). McIntyre (821) described a tabulating machine procedure. 

Discussions of problems of construction—The classic discussion of prob- 
lems and methods of preparing index numbers, by Mitchell, first published 
in 1915, revised in 1921, and long since out of print, has been republished
(826). We quote one passage as a contribution to perspective: 

To judge from the literature about index numbers, one would think that the
difficult and important problems concern methods of weighting and averaging. But
those who are practically concerned with the whole process of making an index
number from start to finish rate this office work lightly in comparison with the field
work of getting the original data. (826: 25).


Other discussions of problems in index number construction are given by 
Black and Mudgett (775), Hudson (812), and Perlman (848). Mont- 
gomery (829) wrote on “the mathematical problem”; reviewers have 
found it difficult to sense his problem. 

Sampling—This topic, which appears incidentally in many discussions
of method and of use of index numbers, has been dealt with explicitly by 
Hanna (809), Hudson (812), Neyman (844), and Schoenberg and Parten 
(855). It is treated, whether by this or some other title, in most descriptions 
of field work in gathering data for current index numbers; the selection 
of cities, of outlets or sources, and of individual items are all practical 
phases of sampling (826). 

Adaptation of formula to tests and to purposes—Fisher (800: 229-34, 
523-31) selected an “ideal” formula on the basis of certain formal tests 
and contended that it was ideal for all purposes. Apparently, however, he 
was thinking only of a certain limited group of purposes; other writers 
have differed sharply with his contention, and have enumerated other pur- 
poses which clearly call for different index number formulas; see Mc- 
Intyre (822), Mitchell (826: 23-25), and Black and Mudgett (775). King 
(816: 55) regarded all formal tests as inconsequential. Frisch (803) devel- 
oped a formula which would meet certain formal tests, and pointed out 
certain incongruities between the tests. Smith (858) analyzed an index 
number series in terms of related factors in an attempt to check its validity. 


Weighting Index Numbers 


Discussions—The question of weighting is an old one and has been 
debated at length. The type of formula used controls certain aspects of 
the weighting: see Arthur (773), Evans (798), and Garver (807). Fisher’s 
selection of the “ideal” formula because of its internal weighting relation- 
ships is well known. The more general problem of weighting concerns the 
weights to be assigned to elements at any given time or observation point: 
see Mitchell (826: 59-68), Black and Mudgett (775), and Johnson (814). 
The discussions of weighting for cost-of-living index numbers, and the 
studies of expenditure patterns, already cited, are pertinent here. 

Practical experiments—As to the importance of weighting of elements, 
experience as well as opinion differs. Allen and Bowley (772) showed a
variation of 16 percent in cost-of-living index numbers when different
weights were used; the Bureau of Labor Statistics finds only 2 percent
variation.

To ascertain experimentally the effect of weights on index numbers which
have been used in educational work, Scates and Fauntleroy (851) applied 
arbitrary weights ranging from 1 to 11, and rotated these weights between 
the criteria according to a fixed scheme so as to see what effect the
changing of the weights would have on the resulting index numbers. The weights
were applied to five series proposed by the National Education Association, 
using both the actual and the rank value forms of the data; as a third study, 
the weights were applied to 11 series used by Schrammel and Sonnenberg;
as a fourth study, they were applied to 15 traits used by Chamberlain; and 
as a fifth study, they were applied to 8 elements of a teacher cost-of-living 
index. The results were analyzed both in terms of correlation and in terms 
of rank or actual displacement in the resulting index numbers. Average 
correlations for the first four studies ran .97, .95, .86, .95; average dis- 
placements in rank (the series having 48 ranks) were 2.5, 3.0, 4.8, and 4.1. 
For the cost-of-living index number, the average difference was 1.4 percent 
—the maximum difference observed being 6.5 percent. The conclusions 
were that weighting is sometimes important, though generally not; it is of 
most effect when the extreme (high or low) weight is applied to a series 
which is highly unique, but when an extreme weight is applied to a series 
having a high correlation with the remaining series the weight has little 
effect. The factor of weighting cannot be considered by itself; it is con- 
nected with the uniqueness of the various traits. 
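
The conclusion about uniqueness can be illustrated with a small constructed case. The three series below are hypothetical (they are not the Scates and Fauntleroy data): the first two are highly correlated with each other and the third is nearly independent of both. The extreme weight of 11 is placed first on a correlated series and then on the unique one, and each weighted composite is correlated with the equally weighted composite.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def composite(series, weights):
    """Weighted sum of the series, case by case."""
    n_cases = len(series[0])
    return [sum(w * s[i] for w, s in zip(weights, series)) for i in range(n_cases)]


# Hypothetical data: a and b agree closely; c is nearly unrelated to both.
a = [1, 2, 3, 4, 5, 6]
b = [2, 1, 4, 3, 6, 5]
c = [3, 6, 1, 5, 2, 4]
series = [a, b, c]

equal = composite(series, [1, 1, 1])
heavy_on_correlated = composite(series, [11, 1, 1])
heavy_on_unique = composite(series, [1, 1, 11])

r_correlated = pearson(equal, heavy_on_correlated)
r_unique = pearson(equal, heavy_on_unique)
print(round(r_correlated, 3), round(r_unique, 3))
```

In this constructed case the composite is little disturbed when the heavy weight falls on a series that duplicates the others, and is markedly rearranged when it falls on the unique series.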

Formal criteria—Proposals have been advanced by Horst, and by Wilks
(both discussed in Chapter XV of the present issue), and by Edgerton and 
Kolbe (795) for determining the weights of composites according to speci- 
fied “internal” or statistical criteria. Whether these technics afford answers 
to some of the problems of weighting in index numbers remains to be seen. 

CHAPTER XV
Statistical Methods¹


PALMER O. JOHNSON


THE EXTENSIVE REVIEW by Cureton and Dunlap of statistical literature
having special application to test construction and analysis, in the June 
1938 Review of Educational Research, supplemented by the bibliography 
of Scates in the December 1938 issue, and the treatment of factor analysis 
and of index numbers in the chapters immediately preceding this one, 
permit delimitation of the present review. Moreover, the time available 
and the space allotted to the topic preclude an exhaustive and thorough 
discussion. It has however seemed advantageous to include a number of 
pertinent developments in other fields than education. This proved to be 
difficult, as the selection of material having special significance for research 
workers in education, out of the vast amount of statistical literature which 
is so rapidly developing, cannot avoid being somewhat arbitrary. 

The main role of statistical analysis in research may be specified as: 
(a) providing a secure basis for the planning of the investigation, (b) 
affording appropriate tests of significance to determine the existence of 
a real effect, (c) providing methods of obtaining the best unbiased esti- 
mate of the effect found to exist, and (d) furnishing efficient means for the 
reduction of data. 

The studies cited have been classified under the following headings, 
not necessarily mutually exclusive: (a) statistical methods and the plan- 
ning of investigations, (b) analysis of variance and covariance, (c) cor- 
relation and regression, (d) the testing of statistical hypotheses, and (e) 
the problems of sampling. 


Bibliographies and Books of Recent Statistics 


Among helpful bibliographies or studies including bibliographies 
should be mentioned those of Buros (879), Dunlap (893), Rider (932, 
933), Preston (930), Shewhart (939), and Swineford and Holzinger 
(948). 

Significant publications on general mathematical foundations of statis- 
tical theory and practice were those of Deming and Birge (891), Fisher 
(898), and Wilks (959). 

Statistical tables of recent publication that should prove of substantial 
value to research workers were those of Fisher and Yates (899), and 
Kelley (919). The tables by David (889) brought together the funda- 
mental treatises on the tests of significance for r and greatly facilitated 
the testing of any hypothesis as to the magnitude of correlation values. 


1 Bibliography for this chapter begins on page 626. 

Statistical Methods and the Planning of Investigations


The development of experimental technics is closely related to the question
of efficiency, that is, how much information the experiment can be made to
yield. Experimental work may be carried out in a number
of ways, but an efficient technic is based upon a knowledge of fundamental 
principles, of the methods available, and familiarity with the sources of 
variation in the experimental material. 

Due chiefly to the contributions of R. A. Fisher and his students, the 
principles of experimentation represented in agricultural experiments are 
more highly developed than those in any other field. One of the most sig- 
nificant principles is that of regarding statistical analysis and the design of 
the experiment as but two aspects of the same problem. Research workers 
in education could profit much by observing this principle; in general 
we have not been critical in the fulfillment of the conditions which render 
our data amenable to extended statistical analysis. The best single treat- 
ment of the principles of experimentation was given by Fisher (896). 
Of especial import is the concept of the self-contained experiment and the 
rigorous discussion of the rationale underlying the requirements of a self- 
contained experiment. The objective of making an experiment self-con- 
tained is to provide a validated, unambiguous interpretation of the experi- 
mental results without reference to other experiments or to previously 
accumulated experiences. The need for supplying a control and the neces- 
sity for the experiment containing within itself the provision of a valid 
estimate of experimental errors, as well as an unbiased comparison be- 
tween the factors tested, are the fundamental aspects of the principle of 
making an experiment self-contained. The role of randomization in experi- 
mental design is succinctly treated, which should correct the rather com- 
mon misunderstanding that the purpose of randomization is to increase 
the precision of the experiment. On the contrary, the aim of randomiza- 
tion is to guarantee that whatever precision the experimental arrangement 
can provide is neither overestimated nor underestimated. 

Particularly suggestive as forms of experimental design are the incomplete 
randomized blocks and the factorial experiments, discussed by Yates (965, 
967) and Fisher (896). These principles of design are of general utility 
whenever dealing with variable materials—for example, comparing the 
effects of different dietary treatments on school children; studying nature 
and nurture through the use of monozygotic and dizygotic pairs of twins; 
ascertaining certain factors underlying growth in learning; and the like. The 
limited utility and the doubtful validity of the so-called “law of the single 
variable” in dealing with biological materials—a technic which receives 
much attention in theoretical discussions of the scientific method—are 
well brought out when contrasted with these efficient methods of design. 
The combining of several single lines of inquiry into a single large fac- 
torial experiment, where the factors are varied concurrently in the different 
possible combinations, is a decidedly more efficient form of arrangement
than the customary method of using one factor as a control. In the factorial 
type of design, all treatment comparisons are of the same accuracy. 
Moreover, this accuracy is the same as that between the control and the 
other factors in the customary type of arrangement with an equal number 
of experimental units. The possibilities of these forms of design in render- 
ing unnecessary the application of partial and multiple correlation technics 
frequently applied to similar problems are promising. 

Johnson (917) discussed and illustrated the relations between statistical 
methods and the design of experiments in education and psychology. 
Snedecor (942) brought out the relationship between efficient experi- 
mental design and improvement of statistical technics in biology. Fisher 
and Yates (899) illustrated the application of the incomplete randomized 
block form of arrangement to data taken from Crew’s study (886) of the 
inheritance of educability. Crutchfield (888) pointed out the possibilities 
of the application of factorial design to psychological research and has 
applied this form of design in the study of five factors as determiners of 
energy expenditures in string-pulling of the rat (887). Chapin (882) 
treated the problems in the design of social experiments and educed a 
pattern of practical procedure which will be found useful to workers 
in education. Peters (929) pointed out that certain features, such as 
extended measurement, the use of tests valid for measuring the presumed 
differences between the groups under comparison, and the repetition of 
experiments, would increase the reliability in controlled experiments. 


Analysis of Variance and Covariance 


The uniformly most powerful tool for research workers is probably the 
analysis of variance, developed by Fisher (896). Fisher described this 
technic as “the arithmetical procedure by means of which the results of 
an experiment may be arranged and presented in a single compact table 
which shows both the structure of the experiment and the relevant results 
in such a way as to facilitate the necessary tests of significance.” It provides 
the mechanism by which the total variance in a record of observations 
may be broken down into parts traceable to specified sources. It is un- 
fortunate that such a valuable technic has been so long neglected in the 
field of educational research, for it has rather wide application. It may 
be applied, for example, in testing out the homogeneity of any number 
of samples with respect to measures such as intelligence, achievement, 
and the like. 
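
The partition described above can be sketched for the simplest case, a test of the homogeneity of several samples grouped by a single criterion. The scores are hypothetical and the function name is invented for the example. The variance ratio F is reported together with Fisher's z, which is one-half the natural logarithm of that ratio.

```python
from math import log


def one_way_anova(groups):
    """Split the total sum of squares into between-group and
    within-group parts and form the variance ratio."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df": (df_between, df_within),
        "F": f_ratio,
        "z": 0.5 * log(f_ratio),  # Fisher's z for the same comparison
    }


# Hypothetical achievement scores for three small classes.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```

The resulting ratio is referred to the tabled distribution for the stated degrees of freedom; that step is omitted here.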

An excellent illustration of the power of the analysis of variance was 
afforded by its application by Fisher and Gray (897) to the examination 
of the value and reliability of the data from Boas’s study on Changes in the 
Bodily Form of the Descendants of Immigrants. Lev (923) applied the 
method to the evaluation of test items of the multiple-choice type. 

The analysis of variance, with the use of the Z-test, is perhaps the simplest
test to use in examining the reliability of regression coefficients and
correlation—both simple and multiple—as well as for testing linearity 
of regression. 

The method of covariance described by Fisher in 1932, by which cor- 
rections can be made in observational data for variations in one or more 
correlated variables, provided a technic for dealing with the problem 
of covariation in the case of heterogeneous data similar to the single vari- 
able problems treated by the mechanism of the analysis of variance. The 
introduction of the analysis of covariance provided a means of further 
control of errors that arise in experiment and are usually incapable of 
control. One interesting application of the analysis of covariance was made 
by Snedecor (940) in an endeavor to institute a statistical control of a 
departmental grading system for students in mathematics. Increased pre- 
cision resulting from the application of the analysis of covariance was 
also demonstrated by Wishart (961) in his growth-rate determination in 
nutrition studies. Wishart (962) also dealt with the problem of the cal- 
culation of the standard errors for means when adjusted for regression. 

Fertig (895) pointed out what had previously been stated by Fisher 
(898), that where the two samples of observations are paired the method 
of differences may be employed. When two samples are independent,
more degrees of freedom are available for estimating variance, so that in
paired observations the basic characters of matching must be sufficiently
highly positively correlated in order to offset the loss in precision resulting
from estimating the variance from the reduced number of degrees of free- 
dom. Barr and Mills (872) presented a short method of calculating the 
standard error of the difference of the means of paired items. This method 
is the method of differences and, based on the same number of degrees of
freedom, gives results identical to those obtained from the more laborious
method of applying the standard error of the difference between means of
correlated measures. Welch (956) formulated a test of the significance of
the difference between two means when the population variances are 
unequal. 
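
The agreement between the method of differences and the longer formula for correlated means can be checked numerically. The paired scores below are hypothetical; the point is only that the variance of the differences equals s1² + s2² − 2·r·s1·s2, so the two routes give the same standard error.

```python
from math import sqrt
from statistics import mean, stdev


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired observations (for example, initial and final scores).
first = [10, 12, 9, 14, 11]
second = [12, 15, 9, 17, 12]
n = len(first)

# Short route: standard error of the mean of the paired differences.
differences = [b - a for a, b in zip(first, second)]
se_by_differences = stdev(differences) / sqrt(n)

# Long route: standard error of the difference between correlated means.
s1, s2 = stdev(first), stdev(second)
r = pearson(first, second)
se_by_formula = sqrt((s1 ** 2 + s2 ** 2 - 2 * r * s1 * s2) / n)

print(se_by_differences, se_by_formula)
```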

Certain theoretical conditions underlie the application of the analysis
of variance. The experimental errors to which the observations are
subject should be independently and normally distributed, with the same
variance. Means and standard deviations of random samples from a normal
homogeneous population are known to be independent. Immer (911) 
found that a negative correlation appeared to exist between the mean 
yield of plots and the standard deviation of samples within these plots for 
uniformity of trial data. The condition of uniform variance is a more im- 
portant restriction than that of normality. Cochran (883) showed that for 
certain types of nonnormal data it is possible to make certain transforma- 
tions by which skew distributions may be changed into distributions 
which are approximately normal with the same variances; among the
transformations illustrated, the logarithmic transformation proves useful 
in dealing with new material of unknown distribution. This may be the 
case in dealing with certain kinds of educational and psychological data. 
Cochran (883) illustrated how the logarithmic transformation could be 
applied to a set of reaction time data when it was observed that the stand- 
ard errors of the original data were proportional to the means. Bartlett 
(874) discussed the advantages in using the square root transformation 
in the analysis of variance as a means of stabilizing the variance when the 
variance is proportional to the mean. 
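
The effect of the logarithmic transformation in the case mentioned, where the standard deviations are proportional to the means, can be shown with an idealized example. The figures are hypothetical "reaction times" constructed so that each group is an exact multiple of the first; real data would satisfy the proportionality only approximately.

```python
from math import log
from statistics import mean, stdev

# Hypothetical groups: each is the first group multiplied by a constant,
# so the standard deviation rises in exact proportion to the mean.
base = [8.0, 10.0, 12.5]
groups = [[x * k for x in base] for k in (1, 3, 10)]

raw_sds = [stdev(g) for g in groups]
ratios = [stdev(g) / mean(g) for g in groups]            # constant across groups
log_sds = [stdev([log(x) for x in g]) for g in groups]   # equal after transformation

print(raw_sds)
print(ratios)
print(log_sds)
```

The square root transformation plays the corresponding part when it is the variance, not the standard deviation, that is proportional to the mean; that case is not shown here.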

Friedman (901) proposed the use of ranks when the assumptions neces- 
sary for the valid application of the analysis of variance are not justified. 
Problems of this nature are rather often encountered in dealing with 
social and economic data. In his method each set of values of the variate
was arranged in order of size, and the ranks were used in the analysis
in place of the original quantitative values. The fundamental step in the
analysis was the computation of $\chi_r^2$ (chi-square) from the table of ranks.
In this method it was impossible to obtain a measure of interaction, but the 
method proposed by Fisher of pooling the probability values of inde- 
pendent tests of significance enabled the use of more of the relevant in- 
formation provided by the data. 
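
A minimal sketch of the rank computation follows, using the usual form of Friedman's statistic, 12/(np(p+1)) ΣR² − 3n(p+1), where n is the number of rows, p the number of columns, and R a column sum of ranks. The function name and the data are hypothetical, and the sketch assumes there are no tied values within a row.

```python
def friedman_chi_square(table):
    """Rank the values within each row, sum the ranks by column, and
    compute Friedman's chi-square-r. Assumes no ties within a row."""
    n = len(table)       # rows (sets of values ranked separately)
    p = len(table[0])    # columns (conditions being compared)
    rank_sums = [0] * p
    for row in table:
        order = sorted(range(p), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 / (n * p * (p + 1)) * sum_sq - 3.0 * n * (p + 1)


# Every row orders the three columns alike: the largest possible value.
consistent = friedman_chi_square([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# The orderings cancel one another: the column rank sums are all equal.
balanced = friedman_chi_square([[1, 2, 3], [2, 3, 1], [3, 1, 2]])

print(consistent, balanced)
```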

It will occur quite often in dealing with problems in education which 
require the testing of homogeneity of a number of samples that the samples 
are of unequal numbers. In the simplest case of analysis of variance where 
the variates are grouped according to a single criterion, no difficulty is 
encountered in the analysis. In dealing with multiple classifications with 
unequal frequencies in each class, especially if the subclass numbers are 
disproportionate, real difficulties arise. Snedecor and Cox (941) presented 
the methods available for analysis of disproportionate subclass numbers 
in tables of multiple classification: (a) the method of expected subclass 
numbers, (b) the method of fitting constants, (c) the method of unweighted 
means, and (d) the method of weighted squares of means. 

For a discussion of the theoretical foundations underlying the analysis of
variance, publications by Irwin (913, 914) and Hendricks (904) will
be found useful.


Correlation and Regression 


The correlation coefficient continues as a much overused tool in edu- 
cational research, likely because of unfamiliarity with other available 
tools. Burt (880), in a discussion of recent developments of statistical 
methods in psychology, indicated that the interest of the statistical psy- 
chologist is shifting from ascertaining the degree of correlation to the 
analysis of variance; also that owing to the impression of many that sta- 
tistical methods could not be applied unless samples were large, the more 
accurate and individual studies of the past have been supplanted by vast
collections of inaccurate data through the application of group tests. With 
the development of small sample technics there is likely to be a return 
to the more accurate and individual studies of the past. Irwin (912)
expressed the desirability of summarizing results in a concrete form rather
than by a single coefficient. The regression line or curve is more informa- 
tive than the correlation coefficient or the correlation ratio when the prob- 
lem consists in relating two quantitatively measurable variables. The rela- 
tion between a quantitative and a qualitative variable is more concretely 
expressed by comparing the variation between arrays with the variation 
within arrays by means of the analysis of variance technic rather than by 
the correlation ratio. When both variables are qualitative, the difference 
between observed and expected frequencies in the different cells of the 
contingency table is frequently more informative than the single coefficient 
of mean-square contingency. 

The many new ways of expressing relationships appear in most cases 
to be no more than the older method of least squares expressed in a new 
notation. The problems in psychology for which the correlation coefficient 
or other coefficients of association are most essential are: (a) where it is 
necessary to determine which of several variables bears the highest rela- 
tionship to a given variable, (b) in factor theory, and (c) in certain 
aspects of test theory and practice. 

Sandon (937) approached the problem of selection by means of an 
examination in a unique manner. Instead of using the method, often
employed, of comparing the performance of those individuals “successful”
in an examination with those who were “nonsuccessful,” he began with a 
theoretical examination correlating .95 with the criterion and observed 
the effect of selection on the correlation coefficient, and computed the 
percent of misfits for different values of r and different critical score 
values. Another part of the investigation consisted in a determination 
of the effect of selection on the relation of two subjects jointly providing 
the bases of selection, as well as of the effect on observed relationships of 
other correlated variables. The significant principle is educed that “the 
other test seems to be the better.” 

Thouless (949) contributed a valuable analysis of the effects of errors 
of measurement on correlation coefficients. He specified the types of 
problems for which correction for attenuation is justifiable, those for 
which the correction is unnecessary, and those for which the correction is 
wholly unjustifiable. A formula was suggested for the valid application 
of the correction for attenuation to partial correlations when correction of 
all the coefficients used is not desired or is impossible. The canonical
correlation and the vector correlation discussed by Hotelling (910) may be
found useful in overcoming certain difficulties not satisfactorily handled by 
present technics in dealing with reliability coefficients and correlations 
corrected for attenuation. 

Wherry (957) analyzed critically the shrinkage of the Brown-Spearman
prophecy formula. He stated that the results from the application of the 
formula appear to contain both constant and chance errors. The shrinkage 
of the Brown-Spearman formula can be satisfactorily predicted by the 
Wherry-Smith correction formula. Remmers and Whisler (931) pointed 
out that test reliability is a function of the method of computation. Rulon 
(935) presented a simplified procedure for determining the reliability of 
a test by split-halves. In place of the usual method of obtaining the standard
error of the score, i.e., $\sigma_{(\text{meas.})} = \sigma\sqrt{1-r}$, he showed that the process can
be simplified by computing the standard deviation of the differences be- 
tween the scores on the halves of the test, which gives the estimated stand- 
ard error of the score from the whole test. 
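
The equivalence can be verified on a small set of hypothetical half-test scores. The sketch assumes that the reliability is taken from Rulon's split-half formula, r = 1 − σ_d²/σ_t², where σ_d is the standard deviation of the differences between halves and σ_t that of the total scores; under that assumption the usual formula σ√(1 − r) reduces exactly to σ_d.

```python
from math import sqrt
from statistics import pstdev

# Hypothetical scores of six pupils on the two halves of a test.
half_a = [12, 15, 9, 20, 17, 11]
half_b = [10, 16, 11, 18, 15, 12]

totals = [a + b for a, b in zip(half_a, half_b)]
differences = [a - b for a, b in zip(half_a, half_b)]

sd_total = pstdev(totals)
sd_diff = pstdev(differences)

# Reliability by Rulon's split-half formula (assumed here).
reliability = 1 - sd_diff ** 2 / sd_total ** 2

# Usual route: standard error of measurement from sigma and r.
se_usual = sd_total * sqrt(1 - reliability)

print(reliability, se_usual, sd_diff)
```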

Harsh and Stevens (903) reported the construction of a mechanical 
correlator. Raw data are entered into the machine in the form of steel 
balls. By means of a few mechanical operations, the regression coefficients 
are obtained. The correlation coefficient is obtained with the aid of a slide 
rule by taking the square root of the product of the two tangents. 
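
The final step rests on the identity that the correlation coefficient is the geometric mean of the two regression coefficients (the slope of y on x and the slope of x on y). A short numerical check on hypothetical data follows; the square root gives only the magnitude of r, the sign being that of the slopes.

```python
from math import sqrt


def slope(x, y):
    """Least-squares regression coefficient of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical paired measurements.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

b_yx = slope(x, y)  # regression of y on x
b_xy = slope(y, x)  # regression of x on y
r_from_slopes = sqrt(b_yx * b_xy)

print(b_yx, b_xy, r_from_slopes)
```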

Two valuable studies dealing with rank correlation have recently been 
published. Hotelling and Pabst (909) stated that the rank correlation is 
of especial value as a test of the presence of correlation without the need 
of assuming normality or other special bivariate distributions. The
significance of rank correlations in small samples was determined by
calculating exact probability values by means of permutations. For samples
of five, it was shown that significant values cannot be obtained (P = .01).
When n is not small enough for the ready determination of exact
probabilities, the Tchebycheff inequality is serviceable but, in general, does not
lead to an accurate approximation of P. The efficiency of rank correlation
in estimating ρ if ρ is really zero was found to be about 91 percent. An
illustration was given of combining the information from two independent 
tests of significance, the interpretation of sex differences in achievement in 
a school subject. The procedure, according to the method given by Fisher, 
consisted of adding the natural logarithms of the two probabilities, multi- 
plying by two, and obtaining the probability value for chi-square with 
four degrees of freedom. Kendall (921) worked out a new measure of rank 
correlation which may be applied to such usual types of problems as (a)
where an observer arranges a known set of weights in ascending order and
(b) where two observers rank a set of musical compositions in order of
preference. The measure employed was the ratio of achieved score to maximum
possible score. The sampling distributions of the statistics were determined. 
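
Two of the procedures described above lend themselves to a short numerical sketch. The first function combines independent probabilities by Fisher's method; note that the sum of the natural logarithms is multiplied by minus two so that the statistic is positive, and that the chi-square tail area has a closed form when the degrees of freedom are even. The second function computes a rank correlation as the ratio of achieved score to maximum possible score, assuming no tied ranks. The probabilities and rankings are invented, and the function names are illustrative.

```python
import math

def fisher_combined_p(p_values):
    """Combine k independent probabilities; chi-square has 2k degrees of freedom."""
    k = len(p_values)
    half_chi_square = -sum(math.log(p) for p in p_values)  # chi-square / 2
    # Upper-tail area of chi-square with 2k df (closed form for even df).
    return math.exp(-half_chi_square) * sum(
        half_chi_square ** j / math.factorial(j) for j in range(k)
    )

def score_ratio_rank_correlation(ranks_a, ranks_b):
    """Achieved score divided by maximum score, n(n - 1)/2; assumes no ties."""
    n = len(ranks_a)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (ranks_a[i] - ranks_a[j]) * (ranks_b[i] - ranks_b[j])
            score += (product > 0) - (product < 0)  # +1 if same order, -1 if not
    return score / (n * (n - 1) / 2)

# Two invented independent probabilities: four degrees of freedom.
combined = fisher_combined_p([0.05, 0.05])

# Invented rankings of four compositions by two observers.
tau = score_ratio_rank_correlation([1, 2, 3, 4], [1, 3, 2, 4])
print(combined, tau)
```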

A number of studies of multiple and partial correlation and regression 
are of recent publication. Kurtz (922) presented a modification of the 
Doolittle method of solving normal equations in which a single forward
and the same set of independent variables may be computed with only
slightly more labor than is required to obtain the multiple correlation 
between a single criterion and the independent variables. Horst (908) 
developed a method for securing a composite measure from a number 
of different measures of the same attribute; for instance, obtaining a com- 
posite measure of scholastic ability from a number of measures, such as 
achievement test scores, intelligence test scores, grades, and the like. An 
equation was presented which gave the weights to be assigned the separate 
measures in order to derive the composite score. The linear combination 
of the original measures gave composite measures such that the sum of 
squares of the differences between all possible pairs of measures was a 
maximum. Wilks (960) presented methods providing weighting systems 
for linear functions of correlated variables when there was no dependent 
variable. Three methods were presented for determining the “best” set of 
weights for combining subtest scores into a final score, for small values 
of n: (a) choosing weights such that the generalized variance of the sub- 
test scores of all individuals with a linear score based on such weightings 
is a minimum, (b) determining weights equalizing the correlation be- 
tween each subtest and the total linear score, and (c) deriving weights 
which equalize the increments of the total score variance by including each 
subtest with the remaining subtests. Blankenship (876) presented a method 
for obtaining regression and standard error calculations from normal 
equations. 

Fisher (898) presented an excellent method for the solution of simultaneous
linear equations, particularly when solutions for more than one system 
of dependent variables for the same set of independent variables were 
desired. His solution led directly to the attainment of the standard errors 
of the partial regression coefficients. The method was also highly advan- 
tageous for obtaining multiple regression equations when one or more 
of the independent variables was eliminated. Mosak (926) presented 
a later development for adjusting the regression coefficient for the omis- 
sion of variables. Tucker (953) presented a lucid explanation of the 
method for finding the inverse of a matrix, an essential procedure when 
the parameters of simultaneous linear equations are to be represented 
as the expression of the constant terms of these equations. Wherry (958) 
derived two formulas for estimating beta coefficients, which should be 
of value in obtaining ready approximations. Wren (964) proposed a 
method for the calculation of partial and multiple coefficients of regres- 
sion and correlation, based on a simplified method of solving systems of 
linear equations by determinants. 
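
The advantage of working from the inverse matrix can be sketched as follows: once the matrix of sums of squares and products of the independent variables has been inverted, regression coefficients for any number of dependent variables, together with their standard errors, follow by simple multiplication. This is a minimal illustration using ordinary Gauss-Jordan elimination on invented data, not a reproduction of the computing routines in the studies cited.

```python
import math

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        divisor = aug[col][col]
        aug[col] = [v / divisor for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fit(columns, y, inverse):
    """Coefficients and standard errors from a precomputed inverse of X'X."""
    n, k = len(y), len(columns)
    xty = [sum(c[i] * y[i] for i in range(n)) for c in columns]
    coefs = [sum(inverse[j][m] * xty[m] for m in range(k)) for j in range(k)]
    resid = [y[i] - sum(coefs[j] * columns[j][i] for j in range(k))
             for i in range(n)]
    error_variance = sum(e * e for e in resid) / (n - k)
    std_errors = [math.sqrt(error_variance * inverse[j][j]) for j in range(k)]
    return coefs, std_errors

# Invented data: a column of ones (constant term) and two independent variables.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
columns = [[1] * 6, x1, x2]
xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]
inverse = invert(xtx)  # computed once, reused for every dependent variable

y_a = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]  # exactly linear, for checking
y_b = [3, 5, 4, 8, 7, 10]                          # a second, invented criterion

coefs_a, ses_a = fit(columns, y_a, inverse)
coefs_b, ses_b = fit(columns, y_b, inverse)
print(coefs_a, coefs_b, ses_b)
```

The diagonal elements of the inverse, multiplied by the error variance, give the squared standard errors of the partial regression coefficients.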

Travers (952) applied a method developed by Fisher (900) to obtain 
the particular combination of test scores which best discriminated between 
two occupational groups: successful engineer apprentices and successful
air pilots. The discriminant function should prove to be a very valuable
technic in the field of educational and vocational guidance. Wallace and
Travers (955) made use of this technic in their study of a group of 
specialty salesmen. 
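
For two groups and two tests the discriminant function can be sketched in a few lines: the weights are obtained by multiplying the inverse of the pooled within-group matrix of sums of squares and products by the vector of differences between the group means. The scores below are invented and the group labels are only placeholders; this is an illustration of the general technic, not of the data of the studies cited.

```python
def discriminant_weights(group_a, group_b):
    """Weights (w1, w2) for two tests; each group is a list of (test1, test2)."""
    def means(rows):
        n = len(rows)
        return [sum(r[k] for r in rows) / n for k in (0, 1)]

    def scatter(rows, m):
        s = [[0.0, 0.0], [0.0, 0.0]]
        for r in rows:
            d = (r[0] - m[0], r[1] - m[1])
            for i in (0, 1):
                for j in (0, 1):
                    s[i][j] += d[i] * d[j]
        return s

    mean_a, mean_b = means(group_a), means(group_b)
    sa, sb = scatter(group_a, mean_a), scatter(group_b, mean_b)
    w = [[sa[i][j] + sb[i][j] for j in (0, 1)] for i in (0, 1)]  # pooled within
    det = w[0][0] * w[1][1] - w[0][1] * w[1][0]
    d = (mean_a[0] - mean_b[0], mean_a[1] - mean_b[1])
    # Inverse of the 2 x 2 pooled matrix applied to the mean differences.
    return ((w[1][1] * d[0] - w[0][1] * d[1]) / det,
            (-w[1][0] * d[0] + w[0][0] * d[1]) / det)

def composite(weights, scores):
    """Weighted combination of the two test scores for one individual."""
    return weights[0] * scores[0] + weights[1] * scores[1]

# Invented scores on two tests for two occupational groups (illustration only).
engineers = [(60, 40), (65, 45), (70, 42), (62, 48)]
pilots = [(50, 55), (55, 60), (52, 58), (58, 52)]

weights = discriminant_weights(engineers, pilots)
mean_engineers = sum(composite(weights, s) for s in engineers) / len(engineers)
mean_pilots = sum(composite(weights, s) for s in pilots) / len(pilots)
print(weights, mean_engineers, mean_pilots)
```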

Travers (951) attempted to develop a means of eliminating the influ- 
ence of repetition on the score of a psychological test. A statistical method 
of deriving weightings for each item in a test was set up, the weighting 
coefficients being linear functions of the difference between “expected” 
score at retest and observed score at retest. The criterion for the weighting 
coefficients is that the difference between the sums of the products of the 
test and retest totals with the weighting coefficients equals the difference 
between the sums of the products of the test and the expected totals. 


The Testing of Statistical Hypotheses 


While it is becoming common to attach a standard error or probable 
error to a statistic in educational research, the addition frequently serves 
no more than an ornamental purpose. In order to have meaning a standard 
error must be of known validity: for example, the standard error of a sam- 
ple coefficient of correlation, as usually calculated, has no real meaning 
unless it can be assumed that the population distribution is of normal 
form. There is much confusion between problems of estimation and prob- 
lems of tests of significance. The problem of estimation consists of selecting 
the most efficient statistic by means of which the best unbiased estimate 
may be obtained of the unknown population parameter. A test of sig- 
nificance is the process of examining the reliability of data. One should 
select the test of significance appropriate for a specific purpose. The pur- 
pose is to test a particular hypothesis. The problem in the testing of sta- 
tistical hypotheses is to determine whether it is likely that certain param- 
eters have specified values. 

Johnson and Neyman (918) presented tests for a number of linear 
hypotheses with particular application to educational problems. The 
properties an educational problem must possess in order to be translated 
into the form of a hypothesis to be tested were discussed. Problems of esti- 
mation were also considered. Of special significance were the means 
presented for elimination of inequalities in basic characters of groups 
under comparison, rendering unnecessary and undesirable the usual practice
of matching individuals on certain basic characters for experimental
purposes. Setting up a “region of significance” provides a means of speci- 
fying the region within which, for values of the basic characters, the 
hypothesis under test is rejected. Johnson (916) extended the application 
to additional educational problems. Walker (954) discussed some funda- 
mental concepts underlying generalizing from sample to population. 
Among the significant topics treated are random sampling, sampling dis- 
tributions, estimation, probability and confidence, and the steps in testing 
a statistical hypothesis. Research workers should profit through reading
this rigorous and clear discussion. It has not been possible to include in
this review any substantial proportion of the extensive statistical literature 
dealing with statistical inference, the testing of hypotheses, fiducial prob- 
ability, confidence belts, and related problems. The publications of Dem- 
ing and Birge (891), Rider (933), Shewhart (939), and Wilks (959) 
dealt with the theoretical foundation of these problems and contained 
reviews and bibliographies of fundamental studies. Pearson and Sekar
(928) indicated the major requirements for the development of efficient
working tools by the theoretical statistician for the research worker. 

The theory of testing statistical hypotheses and of estimation was 
applied by Jackson (915) to the problem of determining the reliability of 
mental and achievement tests. He introduced a new concept, the sensitivity 
of a mental test, which has distinct advantages over the reliability co- 
efficient. In this significant contribution four separate problems have been 
treated: (a) the determination of the trial effect, (b) the determination 
of whether or not the test actually measures the capacity of the individuals 
tested, (c) the estimation of the trial effect if it is found to exist, and 
(d) the estimation of the relative importance of the random errors of 
measurement with respect to true measures in determining the individual 
test score. 

The chi-square test—Another very much neglected statistical tool in 
educational research is χ² (chi-square), which is essentially a test of
significance, or a means of testing statistical hypotheses. While not limited to
such problems, it is the most appropriate test of significance to use when 
dealing with data in the form of frequencies which characterize attributes. 
Merrill (925) used chi-square in testing whether or not test items are 
heterogeneous with respect to difficulty and also with respect to validity. 
Word and Davis (963) applied chi-square to determine the significance 
of the differences between distributions of initial scores and of retention 
scores. Recent publications by Fry (902), Berkson (875), and Camp 
(881) treated the theoretical foundations of the chi-square test and some 
difficulties in interpretation.

Yates (966) considered the problem of testing the independence of 
contingency tables involving small numbers, introducing the correction 
for continuity. Sukhatme (947) derived tests of significance for samples 
of the χ² (chi-square) population with two degrees of freedom, showing
that an extension of the t-test can be made for the significance of the 
difference between lowest values in two samples; also a test analogous to 
the analysis of variance tests is proposed for testing the significance of 
the ratio of the standard errors in two samples. Stevens (945) derived 
tests of significance for determining whether a sample or a set of samples 
can be considered to be in the multinomial distribution. Bartlett (873) 
stated that testing the independence in a 2 x 2 table may be regarded as
testing the significance of the interaction between the two classifications.
He derived a test of the second order interaction in a 2 x 2 x 2 table.
Stevens (944) considered the nature of the distribution of entries in a
contingency table with fixed marginal totals. He obtained the mean, 
variance, and covariance of the sums of entries in any prescribed set of 
cells, with the condition that no two cells of a set are in the same row 
or column, and no cell is common in two sets. An interesting application is 
made to an experiment designed to test out the telepathic powers of a 
large number of people. 
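
The correction for continuity mentioned above for small contingency tables can be illustrated for the 2 x 2 case. The sketch below uses the familiar textbook form, in which half the total frequency is subtracted from the absolute cross-product difference before squaring; the cell frequencies are invented, and the tail probability for one degree of freedom is obtained from the complementary error function.

```python
import math

def chi_square_2x2(a, b, c, d, continuity=True):
    """Chi-square and its probability for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    cross = abs(a * d - b * c)
    if continuity:
        cross = max(0.0, cross - n / 2)  # correction for continuity
    chi_square = n * cross ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    probability = math.erfc(math.sqrt(chi_square / 2))  # upper tail, 1 df
    return chi_square, probability

# Invented frequencies for a small table (illustration only).
uncorrected, p_uncorrected = chi_square_2x2(8, 2, 1, 5, continuity=False)
corrected, p_corrected = chi_square_2x2(8, 2, 1, 5)
print(uncorrected, p_uncorrected)
print(corrected, p_corrected)
```

With small frequencies the corrected value is noticeably smaller than the uncorrected one, and the associated probability correspondingly larger.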


The Problems of Sampling 


The problems of sampling are so fundamental that it is difficult to 
explain why so little consideration is given to them in educational research. 
Rarely does one find an investigation in which more than cursory attention 
is given to the nature of the sample providing the basis upon which gener- 
alizations are drawn. Yet statistical treatment is made of data when the 
validity of this treatment depends upon the fulfillment of certain conditions. 
Problems underlying the choice of a population, the determination of 
suitable means of access to the population, the unit of sampling employed, 
the method of selecting the sample, the actual selection of the sample, the 
means of securing complete data for the units of the sample, tabulation 
and analysis of data, and the application of the sample data are all
important problems for many investigations in education, particularly 
those of the survey type involving often the use of the questionnaire and 
the interview. In experimental studies as well, the validity of generaliza- 
tions depends upon the representativeness and randomness of the sample. 
Stephan (943) gave a valuable discussion of the problems listed above 
as they are found in large-scale surveys. Schoenberg and Parten (938) 
discussed methods and problems of sampling in a study of consumer 
purchases. Bowley (877) considered the application of sampling to 
economic and sociological problems. In his discussion of the amount of 
error in sociological data, McCormick (924) considered factors that are 
equally appropriate for educational data. Roberts and others (934) illus- 
trated in a series of studies on a child population how informative studies of 
this nature can become when rigorous and efficient methods of statistical 
analyses are employed. Among the factors considered were the method 
of ascertainment of the sample and the form of the frequency distributions, 
with special consideration of the form of the lower end of the frequency 
distribution of Stanford-Binet Intelligence Quotients. Merrill’s study (925),
previously mentioned, considered the role of sampling theory in test item 
analysis. 

Unrestricted sampling and stratified sampling are methods of random
sampling usually considered as aspects of the representative method of 
sampling. In the former, single individuals are drawn at random from the 
population, with or without replacement; in the latter, the population is 
divided into several strata and the sample is composed of partial samples,
each being drawn at random from one or other of the strata. The method of
purposive selection, sometimes used, involves intuitional dependence on 
correlation between certain values sought and one or more known values. 
Sukhatme (946), Neyman (927), and David (890) contributed to the 
theory of sampling human populations, including discussions of the repre- 
sentative method. 
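
The distinction between the two random methods can be made concrete with a small sketch. The population, the strata, and the ten percent sampling fraction are all invented for illustration, and proportional allocation to the strata is an assumption of the example rather than a requirement of the method.

```python
import random

def unrestricted_sample(population, size, rng):
    """Draw individuals at random, without replacement, from the whole population."""
    return rng.sample(population, size)

def stratified_sample(population, stratum_of, fraction, rng):
    """Draw the same fraction at random from each stratum separately."""
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, round(len(members) * fraction)))
    return sample

rng = random.Random(1939)  # fixed seed so the illustration is repeatable

# Invented population: 60 urban and 40 rural pupils.
population = [("urban", i) for i in range(60)] + [("rural", i) for i in range(40)]

simple = unrestricted_sample(population, 10, rng)
stratified = stratified_sample(population, lambda unit: unit[0], 0.10, rng)
urban_in_stratified = sum(1 for unit in stratified if unit[0] == "urban")
print(simple)
print(stratified)
```

In the unrestricted sample the number of urban pupils varies from draw to draw; in the stratified sample it is fixed by the design.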

An important statistical problem is that of determining whether or not, 
given a random sample, it may have come from a certain population either 
partially or completely specified. One method is to start from the popula- 
tion and to determine the probability that a given sample should have 
come from this population. Another method is to begin with the sample 
and to determine the probability that a specified population is the one 
sampled. In order to examine the reliability of a statistic, it is necessary to 
know the form of its sampling distribution. Not all these sampling
distributions are normal in character. Hey (906) took samples from four
nonnormal populations and determined the sampling distributions of
correlation coefficients, regression coefficients, and variance ratios
corresponding to degrees of freedom 3:4, 3:12, and 4:12. He determined that the
sampling distributions of these statistics were sufficiently normal to use 
the usual tests of significance for the four nonnormal populations sampled. 
An economical method, using the tabulating machine, was worked out for 
the computations necessary in the sampling investigations. Shewhart (939) 
and Rider (933) have excellent discussions of developments in sampling 
theory. 


Miscellaneous 


Kelley (920) contributed an improved derivation for the determination 
of optimum upper and lower groups for the validation of test items. 
Bradway (878) described the use of the Thompson method for studying 
scale items, validating items, determining the diagnostic value of items, 
and investigating the handicaps of groups with sensory disabilities. 
Dunlap (894) studied the relationship between the type or form in which 
test questions are presented and scoring errors. Hertzman (905) derived 
equations for calculating the sum or the average of all the possible differ- 
ences, and the sum of squares of all possible differences, in a distribution 
of scores. Saffir (936) made a comparison of scales constructed by the
method of paired comparison, of rank order, and by the method of successive
intervals. Hilgard (907) evaluated alternative procedures for the construc- 
tion of Vincent curves and suggested desirable means for choosing between 
the various alternatives. Du Bois (892) proposed a time-saving method for 
computing means and sigmas. Toops (950) published an informative 
bulletin on Hollerith coding. 
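
Sums over all possible differences in a distribution, of the kind treated by Hertzman, need not be obtained by forming every pair. The sketch below uses two standard identities, which are not necessarily the equations of the study cited: the sum of squared differences equals N times the sum of squares less the square of the sum, and the sum of absolute differences is a weighted sum of the ordered scores. The scores are invented, and the direct pairwise computation is included only as a check.

```python
from itertools import combinations

def sum_abs_differences(scores):
    """Sum of |x_i - x_j| over all pairs, from the ordered scores."""
    ordered = sorted(scores)
    n = len(ordered)
    return sum((2 * i - n - 1) * x for i, x in enumerate(ordered, start=1))

def sum_squared_differences(scores):
    """Sum of (x_i - x_j)^2 over all pairs: N * sum(x^2) - (sum x)^2."""
    n = len(scores)
    return n * sum(x * x for x in scores) - sum(scores) ** 2

# Invented distribution of scores (illustration only).
scores = [3, 7, 7, 10, 12, 15]
pairs = len(scores) * (len(scores) - 1) // 2

# Direct computation over every pair, for comparison.
brute_abs = sum(abs(a - b) for a, b in combinations(scores, 2))
brute_sq = sum((a - b) ** 2 for a, b in combinations(scores, 2))

mean_abs_difference = sum_abs_differences(scores) / pairs
print(sum_abs_differences(scores), sum_squared_differences(scores))
```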





CHAPTER XVI 


Classroom Experimentation¹


MAX D. ENGELHART²


In this review an effort is made to bring down to date the corresponding
summary of Monroe (981) which was published in February 1934. In 
that summary and in certain chapters of the texts on educational research 
technics published in 1936 by Good, Barr, and Scates (973), and by 
Monroe and Engelhart (984), are included most of the ideas on experi- 
mental procedures which the educational experimenter should know today. 
In the present review there is some repetition of these ideas because of 
their importance, but the main emphasis is upon certain statistical technics 
which are new to educational experimentation and which may greatly 
modify it. 

Some years ago the status of controlled classroom experimentation was 
compared with the plateau in a learning curve (983). Progress seemed to 
be retarded pending the perfecting of more adequate technics, a condi- 
tion somewhat analogous to that during learning when lower order habits 
have been formed and improvement is at a standstill until higher order 
habits have been developed. There now seems to be some reason to feel 
that the plateau period has been passed and that progress in experimenta- 
tion has again acquired some degree of acceleration. While many, and 
possibly most, of the experiments currently reported in the literature have 
limitations which have been condemned by competent critics for more 
than a decade, an increasing number show evidence of serious attempts to 
avoid these limitations. Furthermore, there have recently developed efforts
to create, or to adapt from other scientific fields, technics to overcome
obstacles to dependable educational experimentation formerly thought to
be insurmountable.


The Experimental Problem and the Experimental Factor 


The typical problem investigated by means of controlled classroom ex- 
perimentation calls for a determination of the relative effectiveness of two 
methods of teaching, of two kinds of instructional materials, or of two 
types of class or school organization. One of the compared procedures is 
administered to the pupils of a given group, while the other procedure is 
administered to the pupils of an equivalent group. Each group may con- 
sist of several classes. The difference between the compared procedures 
represents the experimental factor or, more precisely, the change in the 
experimental factor. The difference between the final achievement means,
or between the mean gains in achievement, of the pupils of the two groups
is taken as an index of the relative effectiveness of the compared procedures,
i.e., of the effect of the change in the experimental factor. Various
statistical technics are employed in efforts to ascertain the dependability of
this index.

¹ Bibliography for this chapter begins on page 629.
² The author is indebted to E. F. Lindquist for a number of valuable suggestions.

The experimental factor investigated need not be restricted to the types 
referred to above. Any factor that can be applied or controlled by the 
investigator and that may or may not produce measurable effects on the 
achievement or other traits of children may legitimately be the subject 
of classroom experimentation. Furthermore, an experiment need not be 
restricted to the investigation of the effect of one change in a single experi- 
mental factor. The effects of different changes in a given experimental 
factor, or the effects of different experimental factors, may be studied 
through the use of different and not necessarily equivalent groups. It is 
probable that many experiments of the type just suggested will be con- 
ducted in education as educational experimenters assimilate and adapt 
to educational conditions the ideas of Fisher respecting the design of 
experiments (971) and his technics of the analysis of variance and covari- 
ance (972) which are useful in the interpretation of experimental data. 

Many currently reported experiments may be criticized for their inade- 
quate definitions of the experimental problem. Unless the experimental 
factor is clearly defined, the hypothesis tested is vague both to the experi- 
menter and to the reader of his report. The experimental factor cannot 
adequately be administered to the pupils participating in the experiment, 
nor can the effects noted be safely ascribed to it as their cause. An experi-
mental factor involving a change in method can best be defined in opera- 
tional terms. For example, instead of defining the method of teaching 
employed with the pupils of one group as the project method and that em- 
ployed with the pupils of the other group as the assignment method, the ex- 
perimenter should state just what specific activities are carried on by the 
teacher of the first group as contrasted with the specific instructional activi- 
ties of the teacher of the second group. If materials of instruction are com- 
pared, their specific differences should be noted. A similar statement may be 
made with respect to compared types of class or school organization, or 
other kinds of experimental factors. The comprehensiveness of definition 
should be a function of the complexity of the experimental factor investi- 
gated. 


Control of Nonexperimental Factors 


The difference between the final achievement means, or between the 
mean gains in achievement, of the pupils in two groups is the resultant of 
many causes. In the ideal experiment, all of the difference is attributable 
to the change in the experimental factor, and the net effects of other 
causes are zero. It is the causes whose net effects must be compensating, 
or zero, which are termed the nonexperimental factors; such factors include
basic pupil traits; instructional procedures and materials not inherent in
the experimental factor; zeal, skill, and other characteristics of the teacher;
and various influences pertaining to the school, the homes, and the
environment of the pupils.

Various technics have been used in efforts to secure control of basic 
pupil traits or the equivalence of groups. Probably the most commonly 
used procedure is the matching of pupils on the basis of measures of in- 
telligence. Sometimes the matching is done on the basis of initial achieve- 
ment measures or a combination of intelligence and initial achievement 
measures. An occasional experimenter has equated groups on the basis of 
learning curves or on the basis of composite scores derived from several 
tests. Possibly, with further development of factor analysis, some
experimenter may attempt to secure equivalence by pairing on the basis of pupil
measures of relevant primary factors and to measure the effect of the
experimental factor by a factor analysis of data secured at the close of 
the experiment. It is possible that the control of pupil traits by matching 
procedures may receive less emphasis as a result of the use of the variance 
technics in the analysis of experimental data. It will always be important, 
however, to seek adequate measurement of basic pupil characteristics in 
order to account for differences between groups and to formulate gen- 
eralizations concerning the effect of the experimental factor. 

One of the major criticisms of contemporary classroom experimentation 
concerns the control of nonexperimental factors other than pupil traits 
(973, 981, 982, 983, 984). Meyers (980) recently produced a study which 
indicated the continued neglect of experimenters to obtain adequate con- 
trol of instructional procedures and experience, skill, and zeal of teachers. 
Control of instructional procedures is most likely to be obtained when the 
procedures to be used are specified in detail. Care must be taken, however, 
that such specification does not result in violation of sound educational 
practice. Experience and skill may be controlled by having the same teacher 
teach both groups after practice with both of the compared methods or 
materials. Zeal may be controlled by the teacher’s acquisition of a sci- 
entific attitude. Control of experience, skill, zeal, and other important 
nonexperimental factors may best be promoted by providing a number 
of pairs of groups in a variety of schools, each pair of groups being taught 
by the same teacher. When an experiment is thus replicated, the chances 
for compensating systematic errors inherent in noncontrol in the different 
pairs of groups are increased. 


Duration of Experiment 


Many contemporary experiments have yielded inconclusive results
because of failure to conduct the experiment over a period sufficiently long 
to result in a significant difference in achievement. The importance of 
experimental duration is recognized in a citation of the Committee on
Awards of the American Educational Research Association respecting an
experiment conducted by Hardy and Hoefer (974). “After two years of 
preliminary study of the local situation, a five-year controlled experi- 
ment was organized for the purpose of ascertaining the effects of certain 
health instruction procedures” (968). Recognition of the importance of 
experimental duration is also characteristic of the investigation of the 
merits of progressive education being carried on over a period of years 
under the direction of Tyler (989). 


Measurement of Achievement 





Much could be written concerning the importance in classroom experi- 
mentation of securing valid measurement of the final achievements or gains 
in achievement of the pupils participating in an experiment. The reli- 
ability of the test or tests used is not often a matter of concern, and it is 
relatively easy to make allowances for the variable errors of measurement 
which are due to test unreliability. Much greater concern should be given to 
the validity of the measures employed. A test may be biased in validity 
and thus cause systematic errors of validity. A mean difference in achieve- 
ment may be due, not to the inherent superiority of one of the compared 
procedures, but to the fact that the test is in part measuring certain irrele- 
vant factors which are more prominent in one group than in the other. 
A frequent cause of this type of experimental limitation is the
employment of a test which measures a restricted range of abilities. Usually, an
experimental problem calls for measurement of more than fixed associa- 
tions or motor skills. There is usually an implicit requirement of measure- 
ment of more general abilities, including skill in reflective thinking about 
the subject matter of the experiment and of attitudes, ideals, and interests
created or modified by the experimental factor. Fortunately, an increasing
number of experimenters are recognizing the necessity of comprehensive
measurement of all relevant outcomes. The experimentation directed by 
Tyler (989) may again be cited in this connection. 




Testing the Statistical Significance of Observed Differences 


The usual procedure in organizing the data of an experiment is to 
tabulate distributions of the scores of the pupils on the achievement test 
administered at the conclusion of the experiment. If an initial and final 
achievement test have been given, gains in achievement may be tabulated,
provided that the test forms are equivalent or that derived measures are
used. It is more desirable to use gains in achievement than to use final 
test scores (978). The mean final achievement, or mean gain in achieve- 
ment, may then be calculated for each group, and the difference in mean 
final achievement, or mean gain in achievement, next obtained. Another 
procedure which may be used with paired pupils is to tabulate the individual
differences in achievement or gains in achievement of the various
pairs. The mean of such a distribution is, of course, equivalent to the dif- 
ference between the final means, or between the mean gains, previously 
mentioned. After the difference has been calculated, the experimenter is 
confronted with the problem of testing its dependability. 

If the assumption of independent random sampling is satisfied³ and
the experimenter has calculated the standard deviations of the distributions
concerned, the standard errors of the final test means, or of the
mean gains,⁴ may be computed and substituted in the short formula for
the standard error of a difference between two means:

$$\sigma_{\text{Difference}\,(M_1 - M_2)} = \sqrt{\sigma_{M_1}^{2} + \sigma_{M_2}^{2}}$$


In an experiment in which pupils have been paired it is possible, but 
seldom justifiable,’ to compute the standard error of the difference be- 
tween the final test means, or mean gains, by the formula just given ® or 
to calculate the standard error of the difference between the final test 
means, or mean gains, through use of the long formula: 

o =avV/g +62 —2¢ @ Tru 
M M 


Difference 
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The symbol rj2 refers to the correlation between the paired achieve- 
ment measures of the pupils of the two groups. Some authorities have 
questioned the legitimacy of calculating the correlation between pairs of 


3 To satisfy this assumption the pupils would need to be selected strictly at random from the popula- 
tion to which the generalizations derived from the data are to apply. Because of administrative diffi- 
culties this is almost never possible or attempted in educational experimentation. Furthermore, to 
satisfy this assumption for equivalent groups, the groups would need to be equivalent wholly as a 
result of chance and not as a result of the usual pairing procedure in which eliminations are made 
of pupils who do not match with sufficient precision. 


o o 
‘If the distributions are final test scores, the formulae are Om: = — and om: = — where 0; and @» refer to 


N 

the two distributions of final test scores. Where the two distributions are distributions of gains, the same formulas 
may be used. The symbols M: and Mz then refer to mean gains, and 0; and @: are standard deviations of the distri- 
butions of gains. If one is dealing with distributions of equivalent initial and final measures of each group, the 
mean gain of each group is the difference between the means of the initial and the final measures of the group. The 
standard error of the mean gain of each group is then calculated by the formula 


eas Voit + 07% — 204 09 v9 





where 0% and @+ refer to the appropriate initial and final measures and ri/ is the correlation between the paired initial 
and final measures of the given group (9, 14). 

5Groups made up of pupils paired according to the usual technics are not independent random 
samples. If the pupils of the two groups were selected strictly at random, as described in a preceding 
footnote, and if pairing could be accomplished without eliminations, the assumption of independent 
random sampling would be satisfied. 

® Generalizations should be restricted to a population of the same distribution of measures for which 
the groups are equivalent, i. e., intelligence measures or measures on an initial achievement test. 
This also applies to the use of the short formula where groups are equivalent as a result of chance and 
no effort is made to pair the individuals in the groups. Successive samples should have the same level 
and range of basic characters, and, hence, generalizations should apply to similar populations. Use of 
the long formula does not involve this restriction. See Walker (990). 
scores of different pupils, that is, the type of correlation coefficient referred
to above. Correlation coefficients are usually calculated for paired
scores of the same pupils.

If the individual differences in achievement, or the individual differences
in the gains in achievement, of the various pairs of pupils have been
tabulated and the standard deviation of this distribution calculated, it is 
possible to compute the standard error of the mean of the distribution of 
differences. (The usual formula for the standard error of a mean is used.) 
As has been mentioned, the mean of such a distribution of differences is 
equal to the difference between the means of the distribution of achieve- 
ment measures of the two groups. The standard error of the mean of the 
distribution of differences is also numerically equal to the standard error 
calculated through the use of the long formula. Again, since paired groups 
are usually not independent random samples, the procedure described is 
seldom, if ever, justified.⁷
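
The numerical equality asserted in this paragraph can be checked with a short sketch. It assumes that the "long formula" takes the familiar form for correlated means, the square root of σ²(M1) + σ²(M2) − 2rσ(M1)σ(M2), and that standard deviations are computed with N in the denominator. The scores and function names are hypothetical.

```python
import math


def pop_sd(xs):
    """Standard deviation with N in the denominator."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def pearson_r(xs, ys):
    """Product-moment correlation between paired scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    return cov / (pop_sd(xs) * pop_sd(ys))


def se_from_differences(xs, ys):
    """Standard error of the mean of the pair-by-pair differences."""
    diffs = [x - y for x, y in zip(xs, ys)]
    return pop_sd(diffs) / math.sqrt(len(diffs))


def se_long_formula(xs, ys):
    """Assumed long formula: sqrt(se1^2 + se2^2 - 2 * r * se1 * se2)."""
    n = len(xs)
    se1 = pop_sd(xs) / math.sqrt(n)
    se2 = pop_sd(ys) / math.sqrt(n)
    r = pearson_r(xs, ys)
    return math.sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)


# Hypothetical final scores of six matched pairs
experimental = [52, 61, 47, 70, 58, 66]
control = [50, 57, 49, 64, 55, 60]

print(se_from_differences(experimental, control))
print(se_long_formula(experimental, control))
```

The two printed values agree apart from rounding error, which illustrates only the arithmetic identity. It does nothing to satisfy the sampling assumptions discussed in the text.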

If, however, the pupils of one group have been selected at random, 
an equivalent group is chosen through use of the matching technic, and
the correlation between the measures used in matching and the measures
of achievement is calculated, the experimenter may legitimately use the 
formula of Lindquist and Wilks in obtaining the standard error of the 
difference between the means of the two groups: 
$$\sigma_{M_1 - M_2} = \sqrt{\left(\sigma_{M_1}^2 + \sigma_{M_2}^2\right)\left(1 - r^2\right)}$$





The generalizations should refer to populations having the same distribu- 
tion on the measures used in matching; populations of the same level and 
range in intelligence where matching is done on the basis of intelligence 
measures, or populations of the same level and range in achievement where 
matching is on the basis of initial achievement measures. 

If the mean of the individual differences has been obtained and the 
standard error of this mean has been computed in the usual way, it may be 
corrected by multiplying by the term $\sqrt{1 - r^2}$, where r here refers to the
correlation between the measures used in matching and the individual dif- 
ferences in achievement, or in gains in achievement, of the paired pupils. 
The generalization should refer to a “universe of individual differences” 
and to populations restricted as indicated above.
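
The two corrections just described may be sketched as follows. The sketch takes the formula of Lindquist and Wilks to be the square root of (σ²(M1) + σ²(M2))(1 − r²), and the correction for the mean of the individual differences to be multiplication by the square root of 1 − r². The function names and the numerical values are hypothetical.

```python
import math


def se_difference_matched(se_m1, se_m2, r):
    """Standard error of the difference between means of matched groups.

    r is the correlation between the matching measure and achievement.
    """
    return math.sqrt((se_m1 ** 2 + se_m2 ** 2) * (1 - r ** 2))


def corrected_se_of_mean_difference(se_mean_diff, r):
    """Standard error of the mean individual difference, corrected by sqrt(1 - r^2).

    r is the correlation between the matching measure and the individual differences.
    """
    return se_mean_diff * math.sqrt(1 - r ** 2)


# Hypothetical standard errors of 1.5 and 2.0 with r = .60
print(se_difference_matched(1.5, 2.0, 0.60))
# With r = 0 the first formula reduces to the one for uncorrelated means
print(se_difference_matched(1.5, 2.0, 0.0))
```

Both functions shrink the standard error as the correlation with the matching measure rises, which is the whole gain sought from matching.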

Underlying assumptions—It is not justifiable for an experimenter to
employ any of these procedures without seeking an understanding of the
assumptions underlying them and the limitations in their use. The interested
reader should consult the original papers of Lindquist and Foster
(978), Walker (990), Lindquist (979), and Wilks (991), the discussions
of Monroe and Engelhart (982, 983, 984), the attack on Lindquist and
Wilks’ formula by Ezekiel (970), Lindquist’s reply (977), and Ezekiel’s
rebuttal (969). If the experimenter is interested in generalizing to a restricted
population of which the pupils participating in the experiment
are a sample, he should consult the paper of Peters and Van Voorhis (986).

7 See preceding footnotes. Ezekiel (969) has pointed out that the generalizations should refer to
“a universe of individual differences,” a not very meaningful concept in educational research. This
would also seem to apply to the numerically equal standard errors calculated by means of the long
formula. Incidentally, the method involving the calculation of the standard error of a distribution of
individual differences is credited to Student.

Combination of measures—Peters (985) recently contended that experi- 
menters have all too frequently concluded that the two methods of teach- 
ing being compared are probably of equal value, though the odds in 
favor of one of the procedures “may be several hundred to one that it is 
superior, though somewhat less than the seven hundred and forty to one 
that the ratio of three indicates.”⁸ He also argued for the summing of
scores on several achievement tests, and showed that a difference thus 
obtained should have a greater ratio to its standard error than the ratio 
obtained from data secured by means of only one of the tests. The effect of 
variable errors of measurement on the ratio is reduced. Lengthening the 
final achievement test would tend toward the same result, but for the dis- 
turbing effect of fatigue. Peters also held that invalidity of the test used 
to measure achievement tends to decrease the ratio between the difference 
and the standard error of the difference. Variable errors of validity, defined 
by Peters as one type of the chance factors unrelated to the experimental 
factor, tend to be neutralized in the difference, since the difference is essen- 
tially an average. The same errors or chance factors, however, tend to 
augment the standard deviation of the individual differences and conse- 
quently tend to augment the standard error of the difference. Peters also 
argued that replication of experiments with later ones and appropriate 
combining of data tend toward an increased ratio between the difference 
and its standard error. 
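
The odds quoted by Peters can be approximated from the ratio of a difference to its standard error. The sketch below assumes that the odds are read from one tail of the normal curve, which reproduces the figure of about 740 to 1 for a ratio of three. The function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Area under the unit normal curve below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def critical_ratio(difference, se_difference):
    """Ratio of a difference to its standard error."""
    return difference / se_difference


def odds_in_favor(ratio):
    """Odds that the true difference lies in the observed direction."""
    p = normal_cdf(ratio)
    return p / (1.0 - p)


print(round(odds_in_favor(3.0)))       # about 740 to 1
print(round(odds_in_favor(2.78)))      # in the neighborhood of the figure given for McCall's coefficient
print(round(odds_in_favor(critical_ratio(4.0, 2.0))))  # hypothetical difference of 4.0, standard error 2.0
```

A ratio of two still corresponds to odds of more than forty to one, which is the point of Peters' objection to treating such differences as negligible.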

When experiments are replicated and the data combined, systematic 
errors pertaining to the various groups tend to compensate each other 
in the combined means. The difference calculated may thus be freed of 
the effects of errors which are systematic for single groups, but not for 
the total. The standard deviations of the pooled data will be augmented, 
however, and the standard errors obtained from them spuriously high. 
While this limitation should result in the derivation of conservative con- 
clusions, the limitation may be avoided by use of the technics of the 
analysis of variance. 
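
The inflation of the pooled standard deviation may be illustrated by partitioning the sum of squares of the combined data. The sketch assumes two replications of an experiment that differ only in general level. The scores are hypothetical, and the partition shown is the elementary one-way case rather than a complete analysis of variance.

```python
def mean(xs):
    return sum(xs) / len(xs)


def sum_of_squares(xs):
    """Sum of squared deviations from the mean of xs."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


def partition(groups):
    """Return total, within-group, and between-group sums of squares."""
    pooled = [x for g in groups for x in g]
    grand = mean(pooled)
    total = sum_of_squares(pooled)
    within = sum(sum_of_squares(g) for g in groups)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    return total, within, between


# Hypothetical scores from the same experiment run in two schools
school_a = [40, 42, 44]
school_b = [60, 62, 64]

total, within, between = partition([school_a, school_b])
print(total, within, between)
```

Nearly all of the variation in the pooled data here arises from the difference between schools. A standard error computed from the pooled standard deviation would therefore be far larger than one based on the variation within schools.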

Analysis of variance—When the analysis of variance technics are used, 
the effects due to different factors are segregated. Given an appropriately 
designed experiment, one can test the relative significance of the em- 
ployment of different teachers, different schools, different methods of 
instruction, different materials of instruction, and other causes, with reference
to each other and to the experimental error—where the last named
refers to variable errors of measurement and of sampling and to other
irrelevant factors. With this technic, the equating of groups diminishes
in importance and we draw away from the classic emphasis in experimentation
of “varying the essential conditions only one at a time. . . .
This ideal doctrine seems to be more nearly related to expositions of elementary
physical theory than to laboratory practice in any branch of
research” (971).

8 McCall's experimental coefficient equals unity, the critical point, when the difference is 2.78 times
its standard error and the odds are 369 to 1.

Use of the variance technics was advocated by Lindquist and Dunlap 
at the February 1939 meeting of the American Educational Research Association.
After brief reference to the advantages of the variance technics,
Lindquist concluded with a timely admonition: “Let us hope that, as we 
first briefly glimpse these possibilities, we will not rush to apply these 
technics with the excess of zeal and the lack of critical consideration which 
have so often characterized our first uses of new technics in the past” (968). 

Johnson-Neyman technic—At the same meeting, Johnson described a 
technic which he and Neyman (975) have developed and which represents 
an outgrowth of technics devised by Fisher. This procedure has the general
purpose of determining the significance of differences between the
achievement measures of two or more groups. The groups may be equivalent,
but this is not essential. In fact, one can use the technic in testing
hypotheses concerning the effectiveness of a given educative factor, or 
of contrasted educative factors, where the groups differ in one or more 
basic characters. The investigator can avoid the administrative difficulties 
inherent in efforts to set up paired groups prior to the application of an 
experimental factor, or the loss of data usually occurring when pairing 
is carried out subsequent to experimentation. The number of basic char- 
acters considered need not be restricted to measures of a single trait—for 
example, intelligence test scores. In most controlled experiments such a 
restriction occurs because of the complexity of matching on the basis of 
more than one criterion. Finally, the technic of Johnson and Neyman takes 
into account the relationship between the initial and final achievement 
measures. It does not treat them as independent. In a recent letter to the 
writer, Johnson named the following additional uses of the technic: 

(1) The significance of the differences between partial regression coefficients and
between multiple regression equations can be tested.⁹

(2) A region of significance can be set up, such that for values of the basic
characters lying within this region the hypothesis (i.e., the null hypothesis that there
is no difference in the means under comparison) would be rejected. In this way,
the inferences drawn need not be restricted to the specific mean values of the basic
characters of the samples under consideration. Setting up the region of significance,
therefore, makes use of more information given by the samples concerning the
population values.
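
The notion of a region of significance may be illustrated by a simplified sketch. It is not the derivation of Johnson and Neyman, whose procedure locates the boundary of the region analytically and admits several basic characters at once. The sketch assumes one basic character, a separate regression line for each group, a common residual variance, and a critical value taken from a table of t. It then tests the predicted difference point by point along a grid. All names and data are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares line: slope, intercept, mean of x, Sxx, residual sum of squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, mx, sxx, rss


def t_ratio_at(x0, group_a, group_b):
    """Predicted difference (B minus A) at x0 divided by its standard error."""
    (xa, ya), (xb, yb) = group_a, group_b
    slope_a, int_a, mxa, sxxa, rssa = fit_line(xa, ya)
    slope_b, int_b, mxb, sxxb, rssb = fit_line(xb, yb)
    df = len(xa) + len(xb) - 4
    s2 = (rssa + rssb) / df  # pooled residual variance (assumed common to both groups)
    diff = (int_b + slope_b * x0) - (int_a + slope_a * x0)
    var = s2 * (1 / len(xa) + (x0 - mxa) ** 2 / sxxa
                + 1 / len(xb) + (x0 - mxb) ** 2 / sxxb)
    return diff / math.sqrt(var)


def region_of_significance(grid, group_a, group_b, t_crit):
    """Grid values of the basic character at which the difference exceeds t_crit."""
    return [x0 for x0 in grid if abs(t_ratio_at(x0, group_a, group_b)) > t_crit]


# Hypothetical data: initial measure x, final achievement y under two methods
x = [0, 2, 4, 6, 8, 10]
method_a = (x, [0.5, 1.5, 4.5, 5.5, 8.5, 9.5])
method_b = (x, [-4.5, -1.5, 3.5, 6.5, 11.5, 14.5])

T_CRIT = 2.306  # tabled t for 8 degrees of freedom at the 5 percent level, two-sided
print(region_of_significance(range(0, 11), method_a, method_b, T_CRIT))
```

In these invented data the two regression lines cross near an initial score of 5, so the difference is significant for pupils well below or well above that level and not for pupils near it.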


According to Lindquist, the technics of Johnson and Neyman assume
simple random sampling of pupils, a limitation which restricts their
application in educational research. Fisher’s technics serve the same purposes
better since they are more conveniently applied to samples consisting
of intact school groups.

9 Kimball (976) also devised technics involving the comparison of regression equations based on
experimental data. The procedure yields an estimate of whether or not the same difference can be
expected for all values of X, the basic character, by testing the significance of a difference between
regression coefficients. This contribution is also an outgrowth of the work of Fisher.

The reader interested in obtaining further information concerning these 
new technics should consult the papers mentioned (975, 976), the texts 
by Fisher (971, 972), and the monograph and text by Snedecor (987, 
988). Unfortunately, for educational workers, these texts were written 
for the fields of biology and agriculture. There is a definite need for a text 
which will clearly demonstrate the applications to experimental educational 
research. 


A Concluding Statement 


An effort has been made herein to indicate the trend in controlled experi- 
mentation in education. This trend will represent progress if applications 
of the new technics and scientific attitudes are concomitants. As time goes 
on, we anticipate more dependable solutions to problems concerning the 
effects of various educative factors and, in consequence, more defensible 
decisions regarding what should be done in the practice of education. 





CHAPTER XVII 


Laboratory Investigations¹


L. C. GILBERT 


The LABORATORY ATTACK upon educational problems is admittedly tedious,
expensive, and limited to relatively small numbers of subjects. Nevertheless,
the objectivity and accuracy of the laboratory technics qualify
them uniquely for the investigation of fundamental principles underlying 
learning and the nature and capacities of the child. Research reports dur- 
ing the past three years have indicated increased interest in the develop- 
ment and refinement of laboratory apparatus and technics, and in their 
application to educational problems. The statistical aspects of experi- 
mentation are treated in Chapter XIV of this issue of the Review. 

Eye movement studies have been the most numerous largely because of 
the mechanical excellence of the present photographic apparatus, the 
wealth of background data, and the general recognition of eye movements 
as objective symptoms of what transpires in the central nervous system. 
Electro-encephalography, formerly confined to medical and psychological 
studies, has recently been employed in an investigation of brain potentials 
in oral and silent reading. Brain potentials, like eye movements, are recog- 
nized as indicators of activity within the central nervous system and lab- 
oratory studies in this field may be expected to yield information highly 
important for the understanding of child development and learning. 

To an increasing extent laboratory studies are manifesting a tendency to 
cross educational boundaries into such fields as music, neuropsychiatry, 
medicine, anthropology, and optometry. One study, for example, inquired 
into the relationship between basal metabolism and intelligence, another 
investigated the effects of oxygen deprivation on reading. Because of the 
interdependence of visual, physiological, psychological, and educational 
factors this tendency is to be regarded as scientifically sound. 

Also encouraging is the increasing interest of investigators in developing 
or adapting instruments and procedures for the study of their own particu- 
lar problems. While much of the apparatus described in the studies possesses 
obvious limitations, and is of limited use, the tendency reflects a dynamic 
research attitude. Certain important reports have been summarized and are 
presented to illustrate these trends. 


Eye Movement Studies 


Extension, refinement, evaluation of investigational technics—Recent investigations
reported by Buswell (997) and by Tiffin and Fairbanks (1043)
employed eye-voice cameras designed to produce simultaneous graphic
records of eye movements and voice; phonograph recordings provided
permanent reproductions of the vocal performance. A small portable binocular
camera intended to make possible the photography of large numbers
of subjects was described by Taylor (1042).

1 Bibliography for this chapter begins on page 630.

Tinker (1045) measured the accuracy of motor control of sixty-four uni- 
versity students and found no significant correlation between accuracy of 
visual fixation and reading proficiency except in extreme cases. Sisson 
(1038) presented evidence tending to discredit the usefulness of the concept 
of “short lived motor habits,” that is, of characteristic rhythmical series of 
the same number of pauses per line and patterns involving a long initial 
pause, several pauses of decreasing length, and a final rather long pause. 

From a study of visual factors in reading, Imus, Rothney, and Bear 
(1018) concluded that for Dartmouth freshmen the Ophthalm-O-Graph is 
unreliable, not a valid measure of reading ability where standardized tests 
are criteria, and that scores are not closely related to academic achievement 
and cannot be used for individual diagnosis or for grouping for remedial 
reading instruction. It would appear that the reading materials selected are 
largely responsible for these findings. Anderson (994) found most eye 
movement measures valid enough for group comparisons. Tinker (1044) 
pointed out that experimental evidence indicates that eye movement records 
of reading are reliable for individual diagnosis if twenty-five or more lines 
of print are employed, and that validity is low if eye movement measures 
are compared with test scores when the reading materials are not compa- 
rable; if strictly comparable materials are used, perception time and fixation 
frequency are highly valid measures. 

Further data on the electrical recording of eye movements were reported 
by Mowrer, Ruch, and Miller (1032), who advanced the corneo-retinal 
hypothesis that there is a persistent potential difference between the back 
and front of the eyes and that galvanometric effects are associated with 
changes in the electrical field. Halstead (1013) effected quantitative record- 
ings of horizontal and vertical movements with the eyes open and closed and 
with the subject moving and at rest; changes traceable to the corneo-retinal 
potential were detected, amplified, and recorded by means of electrodes 
attached about the orbits, a shielded cable, appropriate amplifiers, and con- 
nected markers. Fenn and Hursh (1007) reported that in any one individual 
the potential tends to be constant, but that it varies from individual to indi- 
vidual. Hoffman, Wellman, and Carmichael (1017) compared quantita- 
tively the relationships between voltages produced by eye movements and 
the extent of these movements as recorded simultaneously on film, and 
reported sufficient reliability of the electrical method to justify its use for 
quantitative recording in research. 

Identification of eye movement characteristics—Gray’s summaries (1011) 
of investigations relating to reading included a number of eye movement 
studies: among them, LaGrone’s report (1025) on the eye movements of
deaf children, Fairbanks’ study (1005) of the eye movements and voice in 
the oral reading of good and poor readers, Swanson’s identifications (1041)
of common elements in oral and silent reading of poor readers, and a 
study by Anderson and Swanson (993) of the relationship of eye movement 
measures in oral and silent reading. McFarland, Knehr, and Behrens 
(1028), studying the effects of oxygen deprivation on reading, reported a 
decrease in precision, comprehension, and efficiency of ocular movements;
subjects tended to acclimatize at oxygen percents of 12.5 but not 10.5; the 
average reading time per line and adjustment during fixations appeared to 
be sensitive measures of anoxemia. 

Eye movement training experiments—Sisson (1037) equated three 
groups on the basis of reading test scores, trained one group in fixating 
material when and where it was assumed fixations should fall, trained the 
second group to read comparable material for speed and general meaning,
and used the third as a control. Eye movements were photographed before
and after a four-week training period. He concluded that, for rate, eye movement
training was no more effective than reading with intent to improve,
and suggested the possibility of decreasing comprehension by directing
attention to ocular process. Had eye movement measures been used in equating
the groups and had systematic tests for comprehension of camera material
been employed, a more direct appraisal of the technic would have
been possible.

In a comprehensive study by Buswell (997) visual examinations, reading 
tests, information indices, eye movement photographs, and vocal records 
were used to appraise the reading of 1,000 adults ten years or more out of 
school. His first remedial training experiment emphasized comprehension,
the development of a broader span, sureness, speed, and the reduction of 
vocalization. The second gave specific instruction in word recognition, in 
adapting reading to its purpose, and in increasing eye-voice span. The 
results demonstrated the possibility of improving adult reading, but showed 
greatest gain for the youngest subjects. 


Other Photography 


Normative studies (992, 1022, 1029) have continued to use motion pic- 
ture photography for recording infants’ spontaneous behavior and reactions 
to experimental situations. Gardner (1008) used a motion picture camera 
and accessory apparatus to photograph the pupillary reflex during different 
types of stimulation and during stuttering. Posture silhouettes obtained with 
an Eastman 2A camera suggested to Crook (999) a scale for rating antero- 
posterior posture. Jones (1021) listed front-side-rear photographs obtained 
with a specially built camera as an important feature of the California 
adolescent growth study. 
Electro-Encephalography


In 1938 Jensen (1019) summarized eighty-eight investigations reported
since 1848 relating to electrical activity of the nervous system and described 
apparatus and recording technics. Additional contributions include studies 
by Cruikshank (1001) of the effects of visual stimulation on brain poten- 
tials, by Martinson (1030) of brain potentials during mental blocking, by 
Travis and Hall (1047) of the effects of visual after-sensations upon brain 
potentials, and a comparison by Raney (1034) of the lateral dominance 
and brain potentials of identical twins. Lindsley (1026, 1027) and Smith 
(1039) identified brain potential characteristics of infants, children, and 
adults. Of particular interest to educational psychology is a preliminary 
study by Knott (1024) of brain potentials during oral and silent reading; 
potentials appeared most stable during minimum stimulation, less stable 
during propositional speech, still less stable during silent reading, and
least stable during a combination of reading and speech. 


Tachistoscope Studies 


Laboratory equipment was used for testing and teaching by Bean (995), 
who mounted a twin tachistoscope on a piano in order to study span of 
perception in reading music; he found a low positive correlation between 
years of music training and perceptual span, and demonstrated an effective 
technic for changing part readers into pattern readers. Swanson (1041) 
used a tachistoscope and sound recording in identifying the common ele- 
ments in poor silent and poor oral reading. Eames (1004) reported speed 
of perception slower with poor readers than normal readers, and noted 
some increase in speed with treatment of visual difficulties and training. 
A modified form of the Dodge mirror tachistoscope was used by Keller 
(1023) in a study of ocular dominance and range of visual apprehension; 
the findings were interpreted as indicating a functional relationship between 
retinal halves whose neural connections terminate in the same hemisphere. 
Similar apparatus was used by Crosland (1000), who reported that superior
readers excel poor readers in the left visual field and inferior readers excel 
superior readers in the right visual field; kinship was inferred between left 
eye dominance and effective reading. 


Studies of Visual Factors and Reading 


Variable results obtained with entering school pupils by Gates and Bond 
(1010) from the Betts Ready To Read Tests and Telebinocular were attrib- 
uted at least in part to the inexperience of young pupils in taking tests; the 
frequency of indicated difficulties suggested the desirability of thorough- 
going visual examination for all entering children. Using the same tests, with 
850 pupils from kindergarten to Grade VI, Wagner (1048) was able to show 
a definite and sometimes statistically significant maturation of certain 
visual factors with age and a positive relationship between normal functioning
and success in reading.

Farris (1006) concluded from a study of seventh-grade pupils that, bar- 
ring myopia, hyperopia, and strabismus, visual defects have little effect on 
progress in reading. Witty and Kopel (1050) summarized a number of im- 
portant studies and concluded that poor readers were not characterized by 
a greater incidence of visual defects and anomalies than good readers. 

In a study using the Ophthalmo-Eikonometer, Dearborn and Anderson 
(1002) found that reading disability was more directly related to anisei- 
konia at the near than at the far point, that aniseikonia is one factor in 
50 percent of extreme cases of reading disability, and that it probably 
differentiates good and poor readers better than other eye defects. Imus, 
Rothney, and Bear (1018) concluded that when Dartmouth freshmen were 
grouped according to ocular defects there were no significant differences 
in performance or gains in reading test results, eye movement records, or 
academic standing, and that correction of ocular defects did not guarantee 
immediate improvement. 


Miscellaneous Technics and Apparatus 


Severe space limitations forbid more than brief mention of selected 
samples of other laboratory studies relating to educational problems. Ber- 
rien (996), using a digitalgraph, concluded that an atypical composite 
index derived from finger oscillations of normal college students is evidence 
of emotion, but a typical index does not guarantee absence of emotion. 
Limitations of the electrodermal technic for the measurement of attitudes
were pointed out by Chant and Salter (998), who noted spontaneous deflections
from coughs, sighs, and the like, and who found that galvanic responses
may be occasioned by the difficulty of making decisions rather than
by the nature of the opinion. Among the audiometer studies was one by 
Hall (1012) indicating that for college freshmen auditory acuity is not a 
differentiating factor between normal and defective speaking. Other labora- 
tory studies have correlated metabolism and intelligence (1016), muscular 
tension and learning (46), and hand and eye dominance in relation to 
reading (1009, 1049). 

Growth studies summarized in a recent number of the Review of Educational
Research (1031) included descriptions of such laboratory instruments
and devices as the anthropometric board, improved sliding calipers,
the cephalo dentometer, and the craniometric slide compass. Johnson and 
Evans (1020) described apparatus for measuring visual accommodation 
from light to darkness. A chronoscope with ten pointers for group studies of 
choice, serial response, and the like, was devised by Hertel and Dunford 
(1014). Schlosberg (1036) calibrated nine reaction time instruments of 
seven different types and found errors so general as to indicate that for 
careful research all chronoscopes should be checked. 
CHAPTER XVIII


Organized Research in Education: Foundations, 
Commissions, and Committees¹


CARTER V. GOOD 


IT HAS SEEMED APPROPRIATE in a number dealing with educational
research as a process to include some description of organized provisions 
for research. In view of the fact that these provisions include both national 
and local agencies in the form of official, voluntary professional, and lay 
groups, and in view of the extensiveness of the activity, it has seemed best 
to treat the subject in two chapters. The present chapter will deal with 
foundations which have supported educational research, with the American 
Council on Education which is essentially a large research commission, 
and other commissions and deliberative committees. The chapter which 
follows this will deal with research bureaus and departments in national, 
state, and local organizations. 

No previous issue of the Review has dealt with this topic specifically, so 
that a definite time limit for the earliest material to be included cannot 
readily be set. The majority of the references cited, however, are of recent
date. The treatment has had to be made sketchy to keep it within reasonable 
limits; in general, individual publications of the different organizations 
have not been cited, but only summarizing treatments which contain 
descriptions of these works. Where there were no such summarizing treat- 
ments, no reference for an organization may appear in the bibliography. 


Philanthropic Foundations 


Organized research, on a national as well as an international basis, owes 
a large debt to the philanthropic foundations. Educational and social 
inquiry has been promoted through endowments for higher institutions, 
fellowships, exchange lectureships, subsidies granted to investigators, and 
through the studies and publications of the foundations themselves. 

An especially comprehensive treatment by Hollis (1073) of foundations 
in relation to higher education considered the foundation as a social 
institution in terms of historical development, policies, and organization; 
and described its activities for higher education in the areas of defining 
the college, disseminating information, student and faculty welfare, endow- 
ment and capital outlay, professional education, nonprofessional educa- 
tion, and distribution of grants. The conclusion was reached that higher 
education has received approximately $680,000,000 with an emphasis 
directed increasingly toward social and cultural ideas, and toward adaptations
to a rapidly changing civilization. While recognizing that frequently
the foundations have lacked social awareness and outstanding leadership,
have failed to anticipate educational trends, and have been opportunistic
in seeking the improvement of higher education, Hollis suggested that
with increasingly large and diversified sources of revenue the foundations
may do even more for higher education in the second than in the first third
of the twentieth century.

1 Bibliography for this chapter begins on page 633.

Rio’s congratulatory account (1092) of the work of thirteen foundations 
proposed to analyze the effects of the depression on the grants awarded, 
although the problems discussed were not closely related to this purpose. 
Lindeman (1079) presented a statistical summary of the grants of a 
hundred foundations for the decade 1921-30, and was particularly critical 
of the foundation as a cultural agent, and of donors and trustees. Keppel 
described the foundation as a social institution (1075) and analyzed the 
relationship between philanthropy and learning (1076). Coffman’s study 
(1066) was a sociological treatment of a decade of activity, 1921-30, on 
the part of 55 foundations, 20 community trusts, and 32 child welfare 
organizations, in the area of child welfare. Leavell (1078) included con- 
sideration of the foundation as a part of the total philanthropic aid to 
Negro education. Sears (1093) reviewed the various types of philanthropy 
in American colleges, beginning in the early Colonial period. Gee’s ques- 
tionnaire survey (1070) of the organization of social science research in 
higher education dealt briefly with the aid received from foundations, 
while Ogg (1089) included more detailed descriptions of the part played 
by foundations, endowments, and fellowships in promoting the social 
science research of higher institutions, research organizations, and learned 
societies. 

A list of the foundations, and of their locations and officers, is available 
in the annual educational directory of the United States Office of Educa- 
tion (1098). The annual reports of these organizations include accounts 
of the publications, projects, and activities completed during the preceding 
year, those in progress, and those planned for the future. The “Department 
of Research News” in the Journal of Educational Research at frequent 
intervals reviews the activities of selected foundations, including the major 
projects to which research grants have been awarded. 


American Council on Education 


A very active organization in the sponsorship and supervision of large- 
scale, cooperative studies, and in disbursing research grants from the 
foundations, is the American Council on Education. One of the chief 
functions of the Council is deliberative—to look critically over the whole 
educational scene to discover those issues and procedures which merit 
study. In addition to the functions of research and deliberation, the Council 
provides continuing services in publication, consultation, and participation 
in national organizations and meetings which formulate educational policy. 
The Council publishes annually a booklet (1054) portraying its history, 
organization, functions, activities, membership, and publications, including 
the names of its committees and commissions. 

The Council’s Committee on Problems and Plans in Education chooses 
the problems to be studied and preserves a balanced program of activity. 
Subcommittees of 1939-40 are concerned with general education, the 
master’s degree, business education, professional education, educational 
research, rural social studies, educational journalism, occupational training 
and vocational adjustment, and responsibility and relations of governing 
boards. The major projects and activities of the Council during 1939-40 
are: American Youth Commission, Financial Advisory Service, Committee 
on Motion Pictures in Education, Commission on Teacher Education, 
Cooperative Study of Secondary School Standards, Committee on Measure- 
ment and Guidance, Committee on Student Personnel Work, Committee on 
Cooperative Study in General Education, Committee on Implementation 
of Studies in Secondary Education, Committee on Modern Languages, 
Advisory Committee to the National Resources Planning Board, Committee 
on Government and Educational Finance, and Committee on School Plant 
Research. A number of these projects will be described briefly in the follow- 
ing sections characterizing the research work of selected national commis- 
sions and committees. The numerous activities and projects of the American 
Council on Education are reported at frequent intervals in the Department 
of Research News of the Journal of Educational Research, and in the 
Council’s journal, Educational Record. 


Research Commissions and Deliberative Committees 


The decade of the 1920’s marks the beginning of a period of large- 
scale cooperative investigation, usually under the direction of national 
commissions or committees. This movement has been accelerated during 
the decade of the 1930’s. Among the earlier surveys or studies, in addi- 
tion to those sponsored by the United States Office of Education and 
the Research Division of the National Education Association (which are 
reported in the following chapter), are those dealing with educational 
finance (1068), teacher training (1063), adult education (1058), child 
health and protection (1101), motion pictures and youth (1064), character 
education (1065), genius (1095), social trends (1090), modern foreign 
languages (1072, 1096), and Latin (1051, 1080). Limitations of space do 
not permit a description of these earlier studies. Only brief characteriza- 
tions may be given of selected projects completed or in progress during 
the period 1937-39. Fuller descriptions are available in the following 
works (1059, 1060, 1061, 1062, 1071, 1082, 1087), and at frequent intervals 
appear in the “Department of Research News,” Journal of Educational 
Research; certain of the subsequent summary statements have been adapted 
from such sources. It will be recognized that certain of the reports are of 
the deliberative, evaluative type rather than of a strictly research character, 
but they nevertheless possess major significance for crucial issues and 
problems in education. 

The Committee on the Orientation of Secondary Education presented to 
the Department of Secondary-School Principals, National Education 
Association, its two reports on the issues (1057) and functions (1056) of 
secondary education. The ten functions were discussed at length and 
approved by the committee in its sessions extending over three years, and 
each function was assigned to an individual member for development. A 
Planning Committee developed a nationwide organization of discussion 
groups to analyze the reports of the orientation committee and to consider 
other important problems in secondary education. 

The Commission on Teacher Education of the American Council on 
Education, formed in 1938 for a five-year period, launched during 1939 a 
“Cooperative Study of Teacher Education” involving twenty higher institu- 
tions and fourteen school systems and groups of school systems (1053). 
The major purposes of this study are to encourage a rapid translation 
into practice of generally accepted knowledge in teacher education and to 
stimulate experimentation, broadly conceived, with programs of teacher 
education and teacher growth in service. It has been assumed that these 
two purposes would be best served by working closely at the outset with a 
limited number of institutions and school systems. 

The Committee on Revision of Standards of the North Central Associa- 
tion (1102) reported its findings in seven volumes, completing its program 
of publication during 1937. This project for the revision of accrediting 
procedures had its inception in the general dissatisfaction that had become 
increasingly manifest with the operation of fixed, quantitative standards. 
The general plan of the investigation involved an analysis of all the infor- 
mation that could be considered as having any logical bearing on the 
quality of an institution. 

After six years of intensive work the Cooperative Study of Secondary 
School Standards (1067), under the auspices of the six regional accrediting 
agencies and the American Council on Education, has completed its work. 
It now has ready a body of materials and procedures for evaluation of 
secondary schools believed to be more valid, flexible, and stimulating to 
improvement than any available in the past. In the development and refine- 
ment of these materials, they were first carefully tested in two hundred 
secondary schools of various sizes and types in different parts of the 
country. As a result of this tryout the materials were extensively revised 
and published as “1938 editions” and again tested in ninety additional 
schools. During the spring and summer of 1939 they were further revised 
and rewritten. The “1940 editions” represent the best judgment and experi- 
ence of the Cooperative Study. It is expected that further revisions will 
not be made for at least five years. 

As its program of investigation nears the date of completion (1940), 
the reports and studies of the American Youth Commission of the American 
Council on Education have been appearing with increasing frequency. 
Its projects and special inquiries are so numerous and varied as to preclude 
separate mention here, but may be identified by means of a booklet dis- 
tributed by the Commission (1052). 

The Progressive Education Association has Commissions in three sig- 
nificant areas (1091). The Commission on the Relation of School and 
College (eight-year study) expects to complete its work in 1941. The period 
in which the Commission on the Secondary School Curriculum was to 
work ended on July 1, 1939. Consequently this Commission is engaged in 
sending to the publishers reports of its various committees. The general 
report, Reorganizing Secondary Education (1090), is already in print. 
This attempted to state the fundamental point of view that has permeated 
the work of the Curriculum Commission and its committees, and included 
a list of the reports of special committees. The Commission on Human 
Relations has been engaged in a series of projects to provide materials on 
personality and culture, the family, understanding human behavior, litera- 
ture and human relations, life and growth, adolescents and parents, and 
motion pictures. Some of these materials have already appeared in print. 

The Advisory Committee on Education has completed publication of 
its series of nineteen staff studies, on which the report to the Congress in 
February 1938 was based. These staff studies are listed in the general 
report (1097). Legislation based on the recommendations of the Committee 
has been under consideration by the Congress. 

The Educational Policies Commission of the National Education Associa- 
tion and the American Association of School Administrators has now pub- 
lished three fundamental interpretations of the relationship which public 
education bears to our national life. Created in November 1935, to define 
guiding policies for American education, the Commission published its first 
pronouncement, The Unique Function of Education in American Democracy 
(1086), a little more than a year later. Since then it has elaborated the 
concepts of this initial document into two further statements entitled The 
Purposes of Education in American Democracy (1083) and The Structure 
and Administration of Education in American Democracy (1085). Descrip- 
tions of the current projects and activities of the Commission are reported 
bimonthly in a bulletin, Educational Policy. 

The Social Science Research Council published a series of thirteen 
research monographs concerned with as many social problems (including 
education) in relation to the depression. Each author sought to examine 
critically the literature on the impact of the depression in his field for the 
purpose of: (a) locating existing data and interpretations already well 
established, (b) discovering serious inadequacies in information, and (c) 
formulating research problems feasible for study. The monographs dealt 
with the following topics in relation to the depression: crime, education, 
the family, internal migration, minority peoples, reading habits, recreation, 
religion, rural life, social aspects of consumption, social aspects of health, 
social aspects of relief policies, and social work. The monograph on 
education (1084) prepared by the Educational Policies Commission dealt 
with problems in the areas of: historical and comparative education, 
theory and philosophy, student personnel, program of instruction, staff 
personnel, organization and administration, finance and business adminis- 
tration, and professional and scientific activities. 

The sixteen volumes, including the summary report (1055), of the 
Commission on the Social Studies of the American Historical Association 
covered a wide range of problems—the public interest, administrative 
policy, curriculum, method, measurement, teacher personnel, and the like. 
The contribution of this series of reports should not be minimized by 
criticisms which have been offered, to the effect that it: (a) advocates 
indoctrination, (b) undermines the science of education, (c) imposes a 
frame of reference upon the teacher, (d) exceeds its authority in studying 
the administration of education, (e) fails to formulate a curriculum in the 
social studies, and (f) presents ideas impossible of realization. 

The International Examinations Inquiry is still under way, with a few 
reports yet to be completed in England and the Scandinavian countries 
before a final summary volume can be prepared. The reports issued in the 
United States dealt with the several international conferences on examina- 
tions (1081) and with examinations and their substitutes (1074). 

The Carnegie Foundation for the Advancement of Teaching (1077) 
issued a partial report of a ten-year project in the state of Pennsylvania, 
involving the examination of 26,000 high-school seniors and the testing 
of students in nearly fifty Pennsylvania colleges. The chief interest of the 
report centers on the results of an eight-hour examination in the main 
aspects of a general education. This was given to high-school seniors, 
college sophomores, and college seniors. 

The Regents’ Inquiry into the Character and Cost of Public Education 
in the State of New York resulted in eleven volumes, including the general 
summary report (1088). The problems treated embrace educational 
finance, school district organization, evaluation of the elementary and 
secondary schools, adult and higher education, motion pictures and radio, 
teaching personnel and teacher training, federal aid to the state, the 
state education department, vocational education, school health, com- 
munity relations, and civic education. 

In a few instances yearbooks represent the outcome of a planned 
program of research; as a rule they report the deliberations of a com- 
mission or evaluative treatments of problems in a particular area. Deliber- 
ative, evaluative discussions possess real value by way of analyzing trends 
and identifying issues, even though there may be numerous gaps in the 
research on which the conclusions are based. 

The National Society for the Study of Education contributed a long 
line of distinguished yearbooks, in many instances based on carefully con- 
ducted investigational programs. A booklet (1099) issued by the Society 
in 1926 commemorated a quarter of a century of service to the cause of 
education; it described the organization and growth of the Society, its 
meetings, and the yearbooks. (Each yearbook also includes a list of the 
earlier volumes in the series.) Whipple (1100) more recently contributed 
a chapter which analyzes the 36 yearbooks (67 volumes) of the Society 
published from 1902 to 1936, to reveal the kinds of problems studied and 
the methods of attack. 

Other organizations which issue useful yearbooks include: American 
Association of School Administrators, Department of Classroom Teachers, 
Department of Elementary School Principals, Department of Rural Educa- 
tion, Department of Supervisors and Directors of Instruction, John Dewey 
Society, National Council for the Social Studies, National Council of 
Teachers of Mathematics, and National Society of College Teachers of 
Education. As a rule the current yearbook of an organization lists the 
titles of earlier volumes in the series. Descriptions of current and projected 
yearbooks are reported at intervals in the “Department of Research 
News” of the Journal of Educational Research. 

Selle (1094) described in some detail the activities and publications of 
the numerous departments, commissions, and committees of the National 
Education Association. (The work of the Research Division of the National 
Education Association is treated in the following chapter.) 

It is surprising in a decade of financial pressure that so many large 
educational projects have been organized and successfully financed. A 
possible explanation may be found in the suggestion that philanthropic 
foundations, professional organizations, educational institutions, and indi- 
vidual workers in a time of educational and social maladjustment may 
recognize the urgent need for cooperative attack on current issues (1071). 
Such a coordinated approach does not preclude the exercise of initiative 
and ingenuity in individual problem solving, but emphasizes the fact that 
many of the problems of the social sciences are so vast and complex as to 
yield to nothing less than a program of research, of which a number of 
individual studies may be a part. 


CHAPTER XIX 


Organized Research in Education: National, State, 
City, and University Bureaus of Research ¹ 


DOUGLAS E. SCATES 


This chapter continues the discussion of organized research agencies 
begun in the preceding chapter and deals with departments or bureaus of 
research. The treatment is restricted to materials that are available in 
printed or mimeographed form, and does not itself represent a survey of 
the agencies. It may omit mention of a number of actual agencies because 
published accounts of their work have not been found. Owing to the large 
amount of scattered material the present treatise can be little more than a 
guide to the literature—one step in the direction of a systematic study of 
existing research agencies and their history, which is badly needed. 


United States Office of Education 


The Office was established in 1867 primarily to collect information. In 
1933 the vocational education division was established to take over the 
functions and personnel of the Federal Board for Vocational Education. 
On July 1, 1939, the Office was transferred from the Department of the 
Interior to the Federal Security Agency. The staff of the Office in 1937 con- 
sisted of ninety persons in the general division and eighty-six in the voca- 
tional education division. Present services of the Office “are mainly of three 
types: (a) research and informational; (b) advisory and consultative; and 
(c) promotional” (1106:48). In his recent review of the history, personnel, 
and activities of the Office, Judd (1106) commented: “The educational sta- 
tistics of the United States are more comprehensive than those of other 
countries and on the whole more usable.” “The Office of Education has pro- 
duced an enormous amount of research material of the highest value to the 
American educational system and to the American public” (1106:17, 70). 
Including the vocational education division the Office has conducted 361 
surveys, 47 of which have been nationwide; these survey activities are 
treated in some detail (1106:24-34, 92-93, 115-17). Other activities are out- 
lined in Chapters II, III, and Appendixes A and B of Judd’s report. 

The most recent treatment is a short one by the Educational Policies Com- 
mission (1107), which outlined the history and organization of the Office 
and gave a picture of its current activities, including new projects financed 
from emergency relief funds. A very brief but recent statement was given 
by the Phi Delta Kappan (1121); Chapman (1135:108-11, 135-38) and 
Monroe and others (1237:66-67) also referred to it. Several descriptive 
statements have been prepared and issued by the Office itself or by members 
of its staff. A pictorial and graphic presentation of the work was published 
in 1938 (1118). A brochure giving facts on the origin, history, organization, 
activities, and recent publications was issued in 1935; a revision is in press 
(1116). Segel (1109), in 1936, and Cooper (1104), in 1933, issued inform- 
ative statements. In 1923 the Brookings Institution (1110) devoted one of 
its Service Monographs of the United States government to a description of 
the Bureau of Education—as it was called prior to 1929. Some 60 pages 
of the report deal with history and 40 pages with activities. Organization is 
treated briefly. An appendix of 50 pages cites laws, expenditures, publica- 
tions for 1920-21, and a bibliography of works about the Bureau. Certain 
reorganizations in functions and in personnel have taken place since this 
report was prepared. 

¹ Bibliography for this chapter begins on page 635. 

For recent and current studies the following sources may be consulted. 
The programs of twelve national committees and surveys were described 
in 1934 (1108). The annual reports of the Commissioner of Education con- 
tain much information; until 1918 they contained statistical information 
which is now issued as Biennial Surveys of Education. Since 1933 these 
reports appear only in the Reports of the Secretary of the Interior. School 
Life, the monthly publication of the Office of Education, carries frequent 
reports of projects under way. The “Department of Research News and 
Communications” which appears monthly in the Journal of Educational 
Research contains numerous items on the work of the Office (1105). Other 
references may be found in the Education Index under the head “United 
States—Office of Education,” and in the Readers’ Guide to Periodical Lit- 
erature under the head “United States—Education, Office of.” These last 
four sources include both publications of the Office and articles written 
about the Office and about its publications. The quarterly Journal of the 
American Statistical Association has carried reports on statistical projects 
of the Office in its “Statistical News and Notes” since June 1935. 

The publications of the Office are listed systematically, from the establish- 
ment of the Bureau in 1867 to 1910 (1113), and from 1910 to 1936 (1115). 
These bulletins are kept up to date by cumulative supplementary lists issued 
annually, the latest for the Office in general being 1930-39 (1119) and 
for the vocational education division, 1939 (1120). The 1910-36 list (1115) 
contained the first complete list of the publications of the Federal Board 
for Vocational Education (which began in 1917), and is in general an 
unusually valuable guide. It listed thirty-eight different series of publica- 
tions for the Office and twelve more for the vocational division; it listed 
the annual reports of the Commissioner since 1910; earlier ones are found 
in Index to the Reports of the Commissioner of Education, 1867-1907 
(1111). 

There are also other sources of information on the publications of the 
Office of Education. Certain overlapping lists of publications are useful 
when the two main lists (1113, 1115) are not available: from 1867 to 1907 
(1112); from 1906 to 1922 (1124); and from 1906 to 1927 (1123). The 
last two list only the bulletin series, not other publications. There are two 
lists of Publications Available—one in 1912 (1114) and one in 1930 
(1117). Annual price lists of available materials may be obtained from 
the Government Printing Office (List No. 31, regardless of year); these 
arrange publications only by topic—not by author, class, or serial number. 
When all of these sources fail, there is the Education Index back to 1929 
and the Readers’ Guide back to 1906, and the general indexes of govern- 
ment publications, as the Monthly Catalogue of United States Public Docu- 
ments, back to 1895, which is cumulated (back to 1896) into the Document 
Catalogue. There are also other governmental indexes. Witmer and Miller 
(1122) prepared a helpful analysis of the United States Office publications 
up to 1933; this material was extended somewhat and reprinted by Alex- 
ander (1103) in 1935. Their work has been largely superseded by the bulle- 
tin published in 1937 (1115) but their explanatory comments are still of 
value and some of their classifications are unique and helpful. The role 
of the Office in cooperative research is treated later under that head. 


National Education Association—Research Division 


While the history of the National Education Association goes back to 
1857 the Research Division dates from 1922. The regular staff has increased 
from three workers at the beginning to twenty-one persons in 1938-39. The 
Research Division is the general research agency for the Association; it also 
aids directly a number of the separate departments and the special commit- 
tees in carrying on their research and preparing their yearbooks or other 
reports; it assists in the work of the Educational Research Service—an 
informational and research service primarily for school administrators; and 
at the present time it is contributing to the work of the Educational Policies 
Commission through permitting its director to serve as the productive 
secretary of the Commission. 

The activities of the Research Division (1127) have been described as 
falling in four areas of work: (a) research, (b) informational, (c) editorial 
and consultative, and (d) administrative. While the Division produces a 
noteworthy amount of research of its own, it also serves research in a broad 
way by cooperating with a number of other agencies connected with the 
National Education Association in planning, collecting, writing, editing, 
and disseminating information on problems in their own fields of interest. 
A significant part of the energies of the Research Division is devoted to 
making research materials available to the school administrator and school 
teacher working in the field. It is reported as caring for about 5,000 indi- 
vidual inquiries per year, sent “by students, classroom teachers, parents, 
board members, principals, and superintendents.” Two descriptions of the 
work of the Division were available in 1939 (1127, 1128); one was pre- 
pared in 1932 (1130); accounts by Selle (1132) in 1932, and by Ogg 
(1131) and Chapman (1135:111-12, 138-40) in 1928 and 1927 represent 
statements by writers outside the organization. The activities of the Division 
can be followed in detail in its annual reports (1129). 

The principal publication of the Research Division is the Research Bulle- 
tin, issued five times a year beginning in 1923; the Division also prepares 
the material for the Educational Research Service Circulars, the biennial 
Special Salary Tabulations, and one or more pages in the monthly Journal 
of the N.E.A. The productions are described in some of the references cited; 
Alexander (1125) in his chapter on the publications of the entire National 
Education Association gives a complete list of the Research Bulletins. The 
Education Index, under the head “National Education Association—Re- 
search Division,” lists all the publications issued directly by the Division 
and articles written by the Division or its staff but published in other jour- 
nals. The Readers’ Guide carries the head “National Education Associa- 
tion.” One or two notes have appeared each year among the news items of 
the Journal of Educational Research (1126). 


Research in State Departments of Education 


Systematic discussions, directories, and lists—Chapman (1135) in his 
pioneer study of organized research in 1927 listed five state research 
bureaus and treated them along with other research bureaus of various 
types. The following year he reported on state research bureaus alone, list- 
ing fifteen of them (1134). The United States Office of Education in its 
Educational Directory for 1932 (1153) lists the names of research depart- 
ments and directors for twenty-seven states and Washington, D. C. In subse- 
quent years the names have not been printed in a separate list but have 
been included in the list of Principal State School Officers. Two of the staff 
studies for the Advisory Committee on Education (1137, 1142) make brief 
comments on research work in state departments. Lists of studies made by 
research bureaus of state departments have been prepared for 1929 to 1936 
(with the exception of 1933-34) by the National Education Association 
(1147) and by the Office of Education (1144, 1155). 

The work of individual bureaus—The Division of Research of the State 
Education Department, of the University of the State of New York, has been 
most prominent of all the research bureaus in the literature. We can only 
refer here to several articles (1138, 1139, 1140, 1146, 1148, 1149, 1150, 
1151). It is significant that the organization of the state department as 
modified in 1937 gives the research work the status of a division with an 
assistant commissioner of its own, responsible directly to the commissioner 
and not through an associate commissioner (1149). The Bureau of Statis- 
tical Services is subordinate to the Research Division. The function of exam- 
inations and testing is in a separate division connected with instruction. 
Such an organization appears to give the function of research a position 
which recognizes its potential service. It also presages the time when more 
or less routine testing will generally be distinguished from research in the 
large instead of frequently being regarded as synonymous with it. The his- 
tory of the division and plans for the future are given in (1151). 

Wood and others (1154) issued annual reports on scholarship testing and 
research in Ohio; for references describing statewide testing programs in a 
number of states see (1182:302-304). Mikesell (1145) described other offi- 
cial research in Ohio. For further reports on the work in individual states, 
one should consult the topic “Departments of Education, State” in the 
Education Index; also articles listed by state name (1141). News items in 
the Journal of Educational Research (1143) cover work in eighteen states. 
We may note also the work of related departments, such as the California 
Bureau of Juvenile Research, and the New Jersey Rehabilitation Com- 
mission. 

Stimulation of field research—One of the functions of a full-fledged state 
department program is to stimulate research by others throughout the state. 
Articles have been written concerning such work by Berning (1133), Cock- 
ing (1136), and Coxe (1138, 1140). One device for stimulating research 
is the publication of a list of problems needing study in the state (1148, 
1152). 


State Education Associations and Research 


Educational associations are known to do a good deal of publishing but 
it is somewhat difficult to find a significant amount of published research by 
them. Some of their research is done in cooperation with state departments 
or with state universities and does not appear directly under their name. 
Much of their work is published only in mimeographed form, or in state 
journals, and is not readily available. 

A number of states have active research departments in the state organi- 
zation. Ohio has a report on its director (1160, 1285), and an annual list 
of studies by its research department members (1162). Pennsylvania also 
has a research organization and has published some descriptions (1157, 
1163). The Nebraska department has been active since its establishment in 
1936 (1161). The Illinois Education Association has published an occa- 
sional bulletin (1158). The New York educators are split into a number of 
associations, several of which cooperate with the state department in re- 
search and some of which occasionally publish a study of their own. In 
Texas the research department of the state teachers association has pub- 
lished some studies (1164) and a Commission on Coordination in Education 
consisting of public school, university, and state department representatives 
has published a number of studies (1159). Other references will be found 
in Education Index under the above state names. Also, several states have 
related societies such as the California Association for Adult Education and 
the Michigan Educational Planning Commission, which do research. 

Lists of studies undertaken or completed by state educational associations 
have been prepared by the National Education Association (1147) and by 
the United States Office of Education (1144, 1155). The names of direc- 
tors of research in six states were given in 1932 (1153) but have not been 
published in the Educational Directory since. 


City School Research Bureaus 


Directories and lists of studies—The first list of city research bureaus 
known is that of Nifenecker (1200) in 1918 which contained eighteen cities. 
He noted the difficulty, since encountered by every worker who has attempt- 
ed to list or study research bureaus, in determining when a research bureau 
really could be said to exist. The United States Office of Education has 
issued several publications listing the city research bureaus in 1923 (1170, 
1178), 1924 (1195), and 1931 (1224). Chapman listed 63 in 1927 (1135: 
219-20). A list appeared in the Educational Directory of the Office of Edu- 
cation in 1931 and in 1932 (1153) and research directors have subsequently 
been indicated by code in the annual directories—now in Part II, “City 
School Officers.” Lists of studies undertaken and completed by city research 
departments have been prepared by the National Education Association 
covering 1927-29 (1197, 1198), and by the Office of Education covering 
1929-36 (1184, 1223), with the exception of 1933-34. 

Survey studies of organization, functions, and facilities—While Nife- 
necker made a brief summary in 1918 (1200) the pioneer study and report 
of functions appears to be that of Martens in 1924 (1195). This was fol- 
lowed in 1931 by a second leaflet prepared by Wright (1224). The most 
extended and intimate picture of the conditions surrounding the establish- 
ment of the earlier bureaus and the functions they were expected to dis- 
charge was given by Chapman (1135). While the study was based in part 
upon a questionnaire it presents a detailed story that goes well beyond the 
type of information that questionnaires alone will reveal. Chapman attribu- 
ted the creation of bureaus to several large emphases: (a) the school effi- 
ciency movement—the idea that “the administration of public education 
could profit by utilizing some of the methods developed by industry for im- 
proving efficiency and eliminating waste,” and that there should be “within 
the school system an organization to administer the survey technics and fur- 
nish the superintendent with an accurate evaluation of the status of any 
phase of the schools’ activities” (p. 39); (b) the adjustment movement— 
the conviction that “the school has been made responsible for ascertaining 
the obstacles to learning, with the purpose of bringing about their preven- 
tion or removal” and that “one of the fundamental problems of educational 
research is the devising of means for effecting the best possible adjustment 
of child and curriculum” (p. 54); (c) the measurement movement—the 
“need for setting up machinery to administer standardized tests, to construct 
new tests, to promote the use of tests, and to provide training in technics 
necessary for the attainment of that end” (p. 70); and (d) what might 
be called the objective movement—the desire for facts, such as those pro- 
duced in surveys, “encouraged the creation of bureaus for gathering and 
organizing educational data” and the provision for “a central office to which 
might be referred all inquiries for information about the school system or 
about educational practice in other cities. . . . During a period of three 
years, ending in 1924, the New York Bureau answered more than one hundred 
thousand queries” (p. 89). Other factors in the organization of bureaus 
were the desire for “coordinating and promoting the research activities of 
principals, teachers, and other school people” (p. 99) and the need for 
studying problems of administration, teaching, curriculum, and progress 
of pupils. 

The foregoing statements indicate the diversity of purposes that have 
led to the establishment of city research bureaus. It should be clear that it 
is necessary to divide bureaus into several classes before making general 
statements about them. The lack of homogeneity of organizations that are 
generally known as research bureaus is one of their outstanding characteris- 
tics. There is little value in reporting certain findings in terms of averages 
—such as “the average amount spent per year on the purchase of tests,” 
when some bureaus do no testing whatever and others do little else. 

Other studies which have reported on the activities and facilities of city 
research bureaus are: Herbst (1185, 1186) in 1930, Kaler (1192) and Hull 
and Maynard (1190) in 1931, Brewton (1171) in 1936, and Witsky (1222) 
in 1938. Carr (1175) reported on salaries for 1935. Scates (1207) made 
a questionnaire study of the “career aspects” of the city school research 
bureau as viewed by the directors. 

Zeigel (1225) made an extensive study of research in secondary schools 
as part of the National Survey of Secondary Education in 1932. He included 
returns from seventy city research bureaus. He concluded that “the re- 
searches made by city bureaus are of a relatively simple fact-finding nature; 
they are studies which require but few technical methods or statistical pro- 
cedures in order to interpret and present the data” (p. 64). “If bureaus of 
research in city systems are to lead the way to a sounder educational pro- 
gram, they will find it necessary to place emphasis on fundamental prob- 
lems of educational practice rather than on the mere compilation and 
publication of facts and statistics.” 

The results which Zeigel and other writers hope for will come when 
the profession generally recognizes that a competent research director is 
something more than a young doctor of philosophy having statistical train- 
ing. If the research director is actually to occupy a place of broad responsi- 
bility he must be a man fitted to carry that responsibility. He must have 
an educational training and experience and an understanding of educational 
problems and processes that will be respected by practical administrators 
and teachers; and he must conceive of research in its broad terms and not 
primarily as a set of mechanical technics—sleight of hand tricks by which 
he can work wonders. He will not be concerned solely with objective facts 
apart from psychological facts, or with theoretical criteria to the exclusion 
of widely held practical criteria. He should be as interested in incorporating 
justified conclusions into practice as he is in determining these conclusions; 
and he will be more concerned about carrying the thinking of others along 
with his own than in bypassing theirs. He must exhibit the same qualities 
as any able administrator: understanding, orientation, judgment, sense of 
values, personal leadership. When school superintendents come to expect 
this sort of director and when research workers prepare themselves for this 
sort of opportunity the desired progress in school research bureaus will 
follow. 

Descriptions and reports of individual bureaus—A number of research 
workers have published descriptions of the work conducted by their 
bureaus. Usually these are more intimate and full than the information con- 
cerned in survey studies. More such descriptions are to be desired. The 
bureaus covered are: Baltimore, by Stenquist (1212, 1213); New York 
(1173, 1200, 1204); Denver (1179); Sacramento, by Bursch (1174); and 
East St. Louis, by Osborne (1201). The work at Oakland was reported by 
the Department of Superintendence as an illustration (1196); research 
work at Winnetka has also been described (1169, 1218). Other writers 
with direct experience in city research bureaus have prepared more general 
statements on the functions and values of research: Howell (1188), Hughes 
(1189), Sackett (1205), and Theisen (1214, 1215). 

Reports of the work of city research bureaus are frequently included as 
sections of the superintendent’s report for their city. A few research bureaus 
publish reports of their work: Philadelphia (1203) and Los Angeles 
(1194); a number of others issue mimeographed reports. New York has 
published the findings of the research bureau as one volume of the super- 
intendent’s report (1199). A few research bureaus have been treated in 
city school surveys (1168, 1202, 1217). News items concerning the work 
of twenty-eight research bureaus have been reported in the Journal of Edu- 
cational Research (1183). Printed and mimeographed publications of 
research bureaus will be found under the names of sixteen different cities 
in the Education Index (1181). 

Publicity and research work—Three writers have treated the service that 
the research bureau can render the school system in connection with the 
public relations program: Bain (1167), Scates (1206, 1208), and Tupper 
(1216). 

Organization of bureaus—Several writers have dealt primarily with the 
external and internal organization of research bureaus—with their posi- 
tion in the total administrative scheme of the school system, and with the 
internal working arrangements of the bureau: Keyworth (1193), Sears 
(1210), and Weidemann (1220). A committee of the American Educational 
Research Association (1211) described the desirable organization and 
functions of research bureaus. Other writers have dealt with the subject 
incidentally. Organization for state bureaus is suggested by New York 
(1149). 

Other statements on research bureaus—Articles giving the writer’s con- 
victions on the place, functions, and values of city school research are 
numerous. Space limits forbid a summarization; a few selected articles are 
cited for reading: (1166, 1172, 1176, 1180, 1187, 1191, 1219, 1220, 1221, 
1328). Primarily as sources of earlier references one may wish to consult 
Cubberley (1177), Barr and Burton (1169), and Good, Barr, and Scates 
(1182). Additional references for the past and for the future will be found 
in the Education Index under the topics “Research Bureaus” and “Research 
Workers”; and in the Office of Education’s annual Bibliography of Research 
Studies in Education under the heading “Research Bureaus” in the subject 
index. 


Research Bureaus in Universities and Colleges 


Most of the research bureaus in institutions of higher learning are better 
known by their works than by descriptions of their work. They have been 
organized for a number of different purposes: some study the administra- 
tive and instructional work of their own institution; some study field 
problems—usually those in educational systems of the same state; some 
study theoretical problems in general; some are devoted to testing through- 
out the state; some are devoted to child study; some have a variety of 
purposes. 

The earliest known list of research bureaus in universities is that by 
Baldwin and Smith in 1924 (1170); 22 institutions are included. The 
Office of Education Directory for 1932 (1153) contains 39 bureaus in 
universities and colleges and 25 bureaus (including six at Teachers College, 
Columbia University) at teachers colleges and normal schools. The listing 
or indicating of research directors in higher institutions has not been 
continued in these directories. For other sources of information we have 
recourse in the news items of the Journal of Educational Research (1235) 
where the activities of 26 different bureaus are reported, and in the 
Education Index (1232) where the publications of 22 research bureaus are 
listed under their own name following the name of their institution. 

Chapman (1135: 23-38, 112-14) gave a detailed description of the 
establishing of a number of bureaus and is practically the only source of 
information on some of them. Oklahoma University is given credit for 
establishing the first research bureau in education in 1913 (1135, 1241); 
the Bureau of Cooperative Research was founded at Indiana University in 
1914 (1241). Monroe (1237) described the founding of the bureau at the 
University of Illinois in 1918, and gave detailed information on the history 
and organization of the Institute of Educational Research at Teachers 
College, Columbia University, which was established in 1921, to include 
the division of (educational) psychology, the division of school experi- 
mentation (1226), and the division of field studies. The Institute of 
Practical Arts Research and the Child Development Institute were later 
inaugurated but discontinued; a curriculum laboratory and a guidance 
laboratory are now maintained. Notes on the work of the Institute will be 
found in many issues of the Teachers College Record and in the annual 
reports of the dean of Teachers College. 
The Bureau of Educational Research at Ohio State University has published 
a number of annual reports (1227, 1228, 1229, 1230), one of which 
(1228) is notable as an evaluation of research activities. The last ten years 
of the work of the Iowa Child Welfare Research Station, which began in 
1917, were described by Stoddard (1242) and a decade of research and 
service at Kansas, 1920-30, was described by O’Brien (1239). A number 
of writers have discussed university research bureaus in somewhat general 
terms, based on their experience with particular bureaus: Frasier and 
Whitney (1234), Whitney (1243), Flory (1233), Mort (1238), Griffith 
(1236), and Scroggs (1240). 

Apart from published descriptions one feels constrained to mention a 
number of research bureaus and departments; for example, the child study 
laboratories at the universities of California, Chicago, Cincinnati, Iowa, 
Michigan, Minnesota, Toronto, and Yale; the division of surveys and field 
studies at the George Peabody College for Teachers; the curriculum 
laboratory at the same institution—which has issued a typed list of 47 
curriculum laboratories in universities and public school systems, some 
half dozen of which are probably doing genuine research work in addition 
to or in lieu of immediate production; the statistical laboratory at the 
University of Chicago, which has issued a series of studies; and the bureau 
of educational research at the University of North Carolina, which is now 
sixteen years old. There are also the well-known testing departments of the 
University of Iowa and the Kansas State Teachers College at Emporia. It 
is also worth noting that 21 university presses are listed in the Education 
Index list of publishers; these represent one phase of research. 

Other sources of information on organized university research include 
the official annual reports of the director or deans; the catalogs or 
announcements of the institutions; and the publication catalogs of the 
various university presses. 


Unorganized Research in Colleges and Universities 


While unorganized research does not fall within the scope of this treat- 
ment it is nevertheless true that institutions of higher learning may be 
regarded as organizations which in their totality exist in large part for 
the purpose of research. There is at least enough truth in this position 
to warrant a brief mention, in passing, of the contributions to research 
which emanate from these institutions. 

The outstanding treatise on research in colleges and universities in the 
social sciences is that by Ogg (1259) in 1928. The work does not deal 
directly with education, but the general attitude toward research of the 
institutions which he describes applies to education as well as to other 
departments. Gee (1252) made a survey in 1934 which supplements that 
of Ogg, bringing parts of it up to that date. The advantages and short- 
comings of research by college faculties were discussed by Spahr and 
Swenson (1329) and the National Resources Committee (1326: 165-95). 
The latter pointed out eight ways in which faculty contributions are 
important and stated that the research which is voluntarily undertaken 
by the faculty “properly originates in the scholarly curiosity of the staff 
members and must be largely free and independent” (p. 176). 

Rosengarten (1262) reported on educational research in 45 institutions, 
with an emphasis upon New York University. Studebaker (1263) discussed 
the research contributions of land-grant colleges, and Kelly (1255) did 
the same for state universities. Upshall (1267) dealt with teachers colleges. 
Three writers—Ellis (1250, 1284), John (1254), and Sullenger (1264) — 
discussed research in urban universities. Ogan (1257, 1258) contributed 
an outstanding account of a college faculty working on its own problems. 
Dunbar (1249) and Wrenn (1271) also pointed out beneficial effects of 
research on the faculty. The Office of Education in 1931 published a 
bulletin on university problems (1266); since that time the literature 
dealing with research on higher education has expanded enormously. 
Baehne (1245) edited a book dealing with the applications of tabulating 
machine equipment in universities for both routine and research work. 

For current sources on faculty research one should consult issues of 
School and Society and the Journal of Higher Education; the official publications 
of many institutions which list each year the research completed 
by their staff; the Bibliography of Research Studies in Education, issued 
annually by the Office of Education, and the Education Index (topic: 
“Research—colleges and universities,” or other institutional level desired). 
Many articles will be found dealing with such matters as the function and 
value of research by the faculties, grants and funds for research at various 
institutions, opportunities for research, the equipment which is essential 
for research, problems of administering research, and the like. 

Thesis research—It is unnecessary here to cite individual lists of theses 
or series of thesis abstracts since this material, scattered as it is, has 
already been well covered. Palfrey and Coleman (1260) in 1936 is the 
most comprehensive source but covers all fields. Monroe and Shores 
(1256) also in 1936 is the best source for education theses (topic: “Dis- 
sertations” and “Education, Serial Bibliographies”). Derring’s treatment 
(1248) in 1933 was helpful; Good, Barr, and Scates (1182: 122-23, 
152-53) cited general sources up to 1936; Alexander (1125: 157-58, 
224-29) gives helpful and specific advice on material up to 1935. All 
these sources may be supplemented by current issues of the Education 
Index (topic: “Dissertations, Academic—Bibliography”), by issues of the 
Bibliographic Index (topic: “Dissertations, Academic”), and by the 
Office of Education annual Bibliography of Research Studies in Education 
(topic: “Research, Educational—Reports”). Witmer (1270) in 1932 and 
Heyl (1253) in 1939 have dealt with the sources of information about 
research which is in progress. Several theses have been devoted to the 
study and analysis of the technics and materials used in other theses (1246, 
1247, 1265, 1269). Ernst (1251) and Raeder (1261) have written on the 
value of undergraduate theses as a research stimulus. Further material on 
theses and general university research is contained in the two following 
sections. 


Cooperation and Coordination in Research 


Educational research during the past ten years has included large-scale 
undertakings never before experienced (1286, 1314). The chief sponsors 
of these enterprises have been the Office of Education and the foundations 
and societies described in the preceding chapter. We must recognize also 
the great increase in bibliographical and summarizing services of recent 
years—the Education Index, the Office of Education Bibliographies, the 
National Education Association Research Division summaries, the works 
of such bibliographers as W. S. Monroe and C. V. Good, the Review of 
Educational Research, and others. There are definite and salutary ten- 
dencies to engage the energies of large numbers in attacks upon many 
problems, to integrate scattered attacks, and to make the total program of 
research more systematic, in addition to the general increase in the quantity 
of research studies. 

Cooperation in research may be of several kinds, for several purposes. 
Since the Office of Education is without legal authority to require conformity, 
its enormous statistical studies are plagued by the idiosyncrasies 
and individual volitions of the state departments, cities, and individual 
schools from which it seeks to obtain uniform data. In making a concerted 
attack upon the problem of uniformity in reporting, the Office has engaged 
in a significant type of leadership in a cooperative enterprise. Judd’s 
report (1106: 22-24, 86-94, 97-103) referred to several types of coopera- 
tion and coordination in research which the Office has been able to engage 
in through the opportunities afforded by special funds. We may note the 
five nationwide surveys, the cooperative research in universities project, 
and the study of local units. These types of leadership and followership 
promise much for the advancement of education—if they can be continued. 
They appear to depend upon extraordinary income. Opportunities for 
national leadership were discussed by Cushman and Fox (1282), Cooper 
(1280), Chapman (1135: Chapter 13), and the Department of Superin- 
tendence (1196: 319-21). 

Statewide cooperation has been frequently mentioned as one of the 
objectives of state department research bureaus (1138, 1148, 1279, 1285, 
1292). These purposes were referred to earlier under the section on state 
departments of education. Statewide cooperation with university research 
was discussed by Ashbaugh (1275), Chapman (1135: 33; Chapter 4), and 
Crowley (1281), while a variety of forms of university research coopera- 
tion were presented by Ellis (1284), Kelly (1288), McGrath (1289), 
National Resources Committee (1326: 13-18, 53-58), and Rosen- 
garten (1262). 
Cooperation in the study of secondary-school problems, sometimes 
involving the high-school principals and teachers, was written on by 
Bristow (1277), Davis (1283), Jessen (1287), O’Brien (1291), and 
Proctor (1293). An emphasis upon involving public school workers, 
whether elementary- or high-school, in research programs has been made 
by Ascher (1274), Barr and Burton (1169: 385-402), Brownell (1278), 
Chapman (1135: 101-105), and Toops (1294), as well as by a number 
of writers previously cited. 

Monroe (1290) was one of the early workers to advocate cooperation 
in research (1911). A general survey of research in 1923 (1170) reported 
a certain amount of cooperative work. Good, Barr, and Scates (1182: 
742-47) summarized various points of view and cited references to further 
literature. 

Undoubtedly we need more large-scale research than we have yet seen; 
but the types of problems which can be solved by an army of workers 
following the directions of a few persons are limited. Unquestionably we 
need more planning in research than we have yet had; but planning must 
reckon with the interests and the drives of the individual worker. The 
long-run strength of research will continue to lie in the diversified centers 
where capable workers carry on individually or with a moderate following. 
In conceiving of an ideal program of research for the nation we must 
think of a balanced program having a certain proportion of large enter- 
prises, a certain proportion of individual effort, and a large portion of 
work by moderate sized groups working in a relatively uncoordinated 
fashion. In seeking to work out such a program there is room for national, 
state, and city planning bodies to sense needs, formulate problems in terms 
of the needs, and direct attention to the formulations. Some kind of 
machinery which would facilitate voluntary cooperation by widely 
scattered groups with common interests should also be found. But any 
material degree of regimentation would be more regrettable than the 
present state. It is not correct to assume that all desirable research is of 
the large organization type, or that efficiency in our forward march 
requires that every separate study fit into some preassigned niche in an all- 
enveloping master plan. 


Training for Research Work 


Germane to all organized research is the matter of training workers to 
do research. Like other problems of instruction, this involves questions of 
objectives and of method. Statements on the nature of the work to be 
done and on the qualities and characteristics which are desirable in the 
research worker were made in particular by Buckingham (1298, 1299), 
Hosic (1304), Scates (1306), Whitney (1315), Withers (1316, 1317), 
and the American Home Economics Association (1295). Studies reporting 
on the amount of training now possessed by directors of research include 
most of the surveys referred to in connection with city school research 
bureaus. A survey of courses available for training workers was made 
by Tyler (1312); criticisms by research workers of the training they had 
received were also analyzed by Tyler (1313). Smith (1308) dealt with 
the growing demand for research workers. 

The subject of training research workers, together with the pertinent 
literature up to 1936, was covered in a chapter of Good, Barr, and Scates 
(1182: 708-68), and for that reason comment will not be extended here. 
We may merely note that writers who have contributed pointed articles 
on the problem include the following: Barr (1296), Crawford (1300, 
1301), Good (1302, 1303), Sears (1307), Stephan (1309, 1310), Symonds 
(1311), and Walker (1314). Boyd (1297) wrote at some length from the 
background of physical science research. 


Organized Research in Other Fields 


Research in education is part of a broad attack upon problems along 
all fronts of knowledge which is characteristic of our times. We should 
view educational research not as a solitary enterprise but as one coordinate 
field of activity in man’s general effort to satisfy his curiosity and to 
improve his condition. We are accordingly interested in research in other 
fields as a background and as a basis for comparison in our thinking. 
The social sciences, government, and industry will be referred to here. 


Social sciences—Although the limits of the social sciences will be vari- 
ously set by different workers we may say that this field has been covered 
generally by three books already cited: Ogg (1259) in 1928, Gee (1252) 
in 1934, and Spahr and Swenson (1329). To enter into further detail con- 
cerning such areas as psychology and sociology would carry us too far 
afield. 

Research in government—The National Resources Committee (1326) 
has dealt at length with the amount and character of research which the 
federal government is carrying on. It lists thirteen fields of research, several 
of which are closely related to public education (p. 9). It reports that nearly 
30,000 government employees are in professional and scientific work (not 
necessarily research) and that in 1936-37 the government spent about $65,- 
000,000 from regular funds and about $20,000,000 from emergency funds 
for research of various kinds (p. 8). Another estimate yields a higher 
figure. According to official reports the government spends over $30,000,- 
000 per year on agricultural research and experimentation alone (1318, 
1322). Spahr and Swenson (1329) discuss international, federal, state, 
and municipal research. 

Research in industry—The National Resources Committee expects to 
report later on research by industrial laboratories and by business organiza- 
tions. It comments at present: “Industrial and commercial concerns have 
become so keenly aware of the importance of scientific research in antici- 
pating and meeting competition that many of them maintain staffs of 
trained scientists and invest liberally in the support of investigations made 
by these scientists” (1326: 5). The Committee quotes Hamor (1321), assistant 
director of the Mellon Institute of Industrial Research, in estimating 
that in 1937 25,000 persons were employed in industrial research and that 
$100,000,000 was spent for this purpose. Other statements on the amount 
spent run to twice this figure (1217, 1318, 1322). There is a handbook 
describing industrial research laboratories (1325) and a list of scientific 
and technical societies (1324). For further literature one may consult the 
topic “Industrial Research” in the Industrial Arts Index. 

Articles dealing with various phases of education and industrial research 
have been written by Ayres (1319), Baird (1320), Mavis (1323), Potter 
(1327), Scates (1328), and Woods (1330). Spahr and Swenson (1329) 
also dealt with industrial research, and Boyd (1297) gave a general descrip- 
tion of the nature (and romance) of research in industry. 

Research areas that have not been included—There are a number of 
organizations carrying on educational research that have not been treated 
here. These include private schools, various state and regional education 
associations having somewhat special purposes, and special state, city, and 
voluntary organizations or institutes such as those for psychological pa- 
tients, for physical rehabilitation, for vocational readjustment, and for 
juvenile delinquency and guidance. As related particularly to industry, the 
research on business and commercial activities and problems was omitted, 
although there are numerous bureaus in universities. 


Concluding Statement 


Research bureaus in education are a product of the past thirty years. 
They have done much to advance research; but there is need for further 
clarification of their role in field practice. The diversity of hopes and expec- 
tations which have characterized their establishment indicates the need for 
a wide variety of technical services in educational systems. It will be well 
if administrators, while providing for these technical services, will recog- 
nize that a certain proportion of time should be set aside for making studies 
of the less routine sort which have more far-reaching significance and more 
permanent value. It will be well also if research is looked upon in each 
individual school system or state not as a narrow activity confined to a 
single area of education, but as one phase of each area, and provision made 
accordingly. Research is larger than statistical work; it is something more 
than testing. It is a continuous fact-finding, exploring, investigating service 
applicable to all aspects of education—administration, business manage- 
ment, finance, schoolbuilding, transportation, curriculum, testing, instruc- 
tion, and psychological and sociological principles. The fact that research 
is often identified with only one phase of education means that educators 
are failing to take advantage of the benefits which current educational 
research stands ready to afford. 
to the first allusion in running (perhaps intermittent) discussions. One should scan 
several pages following the one cited. 


Activity programs, equipment, 527 

Administration, history of, 459 

American Council on Education, 570 

Analysis of variance, 545, 561 

Appraisal, of courses of study, 525; of institutions, 
523; of instruction, 525; of 
instructional material, 526; of school 
buildings, 527; see also evaluation; 
measurement; rating; textbook appraisal 

Aptitude, for college, 517; for high school, 
517; measurement, 517 

Assumptions, 559; in research, 468 

Attitudes, 471; social, 520 


Behavior patterns, 473 

Bibliographical work, aids, 453, 457; see 
also historical method; legal research; 
microphotography 

Biographies, 457, 484 

Bureaus of research, see organized re- 
search 

Business, status and trends, 540 


Case study, 483; in instruction, 486 

Causation, 556 

Checklists, for behavior, 474; for courses 
of study, 525; for institutions, 524; for 
instruction, 525; for parent-child rela- 
tionships, 487; for special education, 
525; see also rating 

City school systems, bureaus of research, 
581 

Classification of cases, 467, 473 

Clinical approach, 483 

Colleges and universities, accrediting, 512, 
536; bureaus of research, 584; faculty 
research, 585; see also higher educa- 
tion 

Cooperative research, 571, 587 

Cooperative Study of Secondary School 
Standards, 572 

Correlation, technic, 547 

Cost, trends, 534; of living, 535, 538 

Criticisms, of research, 502 

Curriculum making, frequency studies, 
466 


Definitions, in research, 474, 556 
Delinquency, development of delinquent 
careers, 488; treatment programs, 485 


Depression, effects on education, 573 
Documentary analyses, quantitative, 466 


Educational Policies Commission, 573 


Electro-galvanometric studies, 492, 495, 
565 


English, errors, 470 

Environment, 493 

Errors, 470; of measurement, 548; see 
also English, errors 

Evaluation, of educational outcomes, 521 

Evaluation Staff, 521 

Experimentation, classroom, 555; labora- 
tory, 564; technic, 551, 555, 564 

Eye movement studies, 564 


Factor analysis, 515, 528; history, 528 

Factor clusters, 488 

Federal government, research by, see re- 
search, in government; U. S. Office of 
Education 

Federal support of education, 534 

Field research, 580, 587 

Follow-up studies, 485 

Foundations, philanthropic, 569 


Frequency studies, see documentary analy- 
ses 


Genetic research, 491 

Graduate education, rating, 537 

Growth and development, technics of 
study, 491; curves, 493 


Heredity, 493 

Higher education, buildings, 512; surveys, 
511; see also colleges and universities; 
graduate education 

Historical method, 456; see also biblio- 
graphical work; documentary analyses 

History of education, 457 

Hollerith, see tabulating machines 

Home, 492 

Homogeneity of data, 582 

Hypotheses, 551 


Index numbers, 522 
Industry, see research, in industry 
Infants, growth, 492 


Intelligence, and environment, 493, 515; 
changes in, 515 
Intelligence tests, 514; bibliographies, 516; individual, 514

Interpretation, 466

Interviews, 484, 498; in research, 500; in
teaching, 501; reliability and validity,
498


Laboratory studies, 564

Laterality, 567

Legal research, methods, 460; see also
bibliographical work

Legislation and court decisions, 459, 461;
needed research, 463

Library procedures, see bibliographical
work


Measurement, 514, 574; bibliography, 520;
criticisms of, 523; philosophy of, 520;
see also norms; reliability; scaling;
tests and scales; validity

Mechanical aptitudes, 518

Microphotography, 455

Motion pictures, frequency analysis, 467

Motor abilities, 493


National Education Association, Research
Division, 578

Needed research, legal aspects, 461

Negro education, support of, 570

Norms, 512; factors affecting, 492; use
of, 495

North Central Association of Colleges and
Secondary Schools, 512

Note taking, see bibliographical work


Objectives, broadened, 522

Objectivity, of direct observation, 474

Observation, as a research technic, 472;
reliability, 478; see also recording of
observations

Organized research, 569, 576


Personality, development of, 494; measurement,
518

Photographic recording, 474, 566; roentgenographs,
494; see also eye movement
studies; microphotography

Physical education, measurement, 513;
surveys, 513

Physical development, curves, 493; prediction,
493

Prenatal development, 492, 494

Professional aptitudes, 518 
Profiles, 536 


Public relations, and research, 583 


Questionnaires, 509; studies of, 502 


Rating, 520; of pupils, 526; of teachers,
526; scales, 526; see also appraisal;
checklists; score-cards

Reading, and visual ability, 567; diagnosis,
565; difficulty of material, 471

Reasoning, studies of, 492 

Recording of observations, 474; see also 
photographic recording; sound record- 
ing; stenographic recording 

Records, pupil personnel, 484 

Reliability, calculation of, 549; concept 
of, 521, 552; of direct observation, 477; 
of interviews, 498; of laboratory instru- 
ments, 565, 568; of questionnaire data, 
502 

Research, in government, 589; in industry,
589; see also assumptions; bibliographical
work; case study; cooperative research;
documentary analyses; experimentation;
field research; genetic research;
historical method; interviews;
laboratory studies; legal research;
methods; microphotography; observation;
organized research; questionnaires;
rating; statistical methods; surveys;
tabulating machines; theses


Sampling, 468, 478, 541, 553, 559 

Scaling, 554; case histories, 487, 521 

School buildings, appraisal, 512, 527 

Score-cards, 512, 527; see also rating 

Scoring, of direct observations, 476 

Social background of education, 458 

Social patterns, 486 

Social pressure on schools, 458 

Social Sciences, research in, 589 

Social surveys, 508; methods, 509 

Sound recording, 474, 499, 564 

State education associations, research, 580 

State school systems, bureaus and de- 
partments of research, 579, 587; evalu- 
ation, 524, 535 

Statistical methods, 543; see also analysis 
of variance; correlation; factor anal- 
ysis; index numbers; research; sam- 
pling; scaling; tests of significance; 
weighting 
Stenographic recording, 474 

Superstitions, 503 

Surveys, 508; bibliography, 511; city, 
512; higher education, 511; history, 
511; methods, 511; see also social sur- 
veys 


Tabulating, 469 

Tabulating machines, 470, 554 

Teacher education, 572 

Teaching load, 537 

Tests and scales, 514, 520; construction 
of, 522; see also measurement 

Tests of significance, 551, 559 

Textbook appraisal, 526 

Theses, 586 


Time studies, 476 
Training of research workers, 588 


United States Office of Education, 511, 
576, 587 


Unit, observation, 474 


Validity, of interviews, 499; of question- 
naire data, 504; technic of validation, 
554; technics for validating, 487; tech- 
nics, 521 

Visual defects, and reading, 568 

Vocabulary, frequency studies, 468 

Vocational aptitudes, 518 

Vocational interests, 519 


Weighting, 538, 541, 549 
Ability grouping, 178, 360; see also
exceptional children

Absence, see attendance 

Accounting, child, see records; reports;
school census; and cross references under
school population

Achievement, and intelligence, 156; and 
promotion, 170; prediction of, 156, 200; 
survey of, 156; see also study, meth- 
ods; success in school 

Activity curriculum, history, 340 

Activity programs, 302; equipment, 527 

Administration, history of, 459 

Adapting instruction to pupils, 177 

Adjustment, 150; diagnosis, 215; inter- 
views, 212; mechanisms, 288; scales, 
198; see also personality; environment 

Adolescence, 290; mental development, 
38; pubescence, 72 

Adult education, Americanization, 354; 
guidance, 192; history, 352; see also 
forums; illiteracy 

African education, 398; finance, 406 

Age-grade status, see progress in school 

Age-height-weight, see nutrition; physical 
development 

Agricultural education, 408; bibliography, 
409; see also vocational education 

American Council on Education, 570 

Americanization, see adult education 

Analysis of variance, 545, 561 

Anecdotal records, 174 

Anthropometry, 80 

Antisocial behavior, 183; see also behavior 

Appraisal, of courses of study, 525; of 
education in China, 386; of guidance 
programs, 196-220; of institutions, 523, 
524; of instruction, 525; of instructional 
material, 526; of marking systems, 172; 
of school buildings, 527; of teaching, 
see rating of teachers; see also evalua- 
tion, measurement, rating, textbook ap- 
praisal 

Aptitude, for college, 517; for high school, 
517; measurement, 517; see also voca- 
tional aptitude 

Arithmetic, development of concepts, 32, 
289 

Art, ability, 34; see also esthetic develop- 
ment 

Assumptions, 559; in research, 468 

Attendance, absence, 162; amount, 164; 
legal aspects, 167; services, 161; see 
also pupil personnel services, school 
census 
Attitudes, 471; social, 520; survey of, 150, 
153, 157 

Atypical children, see exceptional children

Auditory aids, 301 

Autobiographies, see records 


Behavior, antisocial, 183; patterns, 473;
problems, 151; rating, 200; see also 
adjustment; personality 

Bibliographical work, aids, 453, 457; see 
also historical method; legal research;
microphotography 

Bibliographies, adult education, 417; case 
method, 599; characteristics of pupil 
population, 221; classroom experimentation, 629;
comparative colonial education, 440; comparative school finance,
441; comparative vocational education 
and guidance, 443; concepts of educa- 
tion in Czechoslovakia, 432; current 
historiography, 593; development of 
motor functions and mental abilities in 
infancy, 111; direct observation as a 
research method, 597; education and 
social trends, 419; education in the 
ancient world, 436; educational changes 
in Germany, 1936-1939, 428; educa- 
tional research in Latin America, 427; 
elementary education, 413; factor anal- 
ysis, 619; general methods of teaching, 
327; genetic method, 602; higher edu- 
cation, 415; history of education in the 
British Commonwealth of Nations, 421; 
history of education in the Far East, 
433; index numbers and related com- 
posites, 622; the interview, 607; labora- 
tory investigations, 630; legal research 
in education, 594; library and biblio- 
graphical procedure, 591; mental and 
motor development from two to twelve 
years, 114; mental development in ado- 
lescence, 125; motivation, emotional re- 
sponses, maturation, intelligence, and 
individual differences, 321; organized 
research in education: foundations, 
commissions, and committees, 633; or- 
ganized research in education: national, 
state, city, and university bureaus of 
research, 635; physical growth from 
birth to maturity, 125; physiological 
factors and mental development, 136; 
preschool education, 412; programs of 
guidance and counseling, 236; quantitative
analysis of documentary materials,
595; questionnaires, 608; rating
scales, score-cards, and checklists, 617; 
relationships in physical and mental 
development, 134; school and commu- 
nity surveys, 609; school organization 
and classroom adjustment, 227; second- 
ary education, 413; statistical methods, 
626; supervision, 328; technics of 
guidance and counseling, 240; technics 
of research in physical growth and an- 
thropometry, 133; testing: intelligence, 
aptitude, personality, achievement, 610; 
theoretical aspects of learning and 
transfer of training, 312; types of learn- 
ing and general conditions affecting 
learning, 318 

Bibliography, criticisms of, 523 

Bilingual pupils, and intelligence, 25 

Bilingualism, and intelligence, 293 

Biographies, 348, 457, 484 

Birth-rates, 26 

Blind, 182 

Bureaus of research, see organized re- 
search 

Business, status and trends, 540 


Canadian education, finance, 404; history, 
365 

Case study, 206, 209, 483; clinical coun- 
seling, 214; in instruction, 486; needed 
research, 212 

Causation, 556 

Census, see school census 

Central American education, see Latin 
American education 

Checklists, for behavior, 474; for courses 
of study, 525; for institutions, 524; for 
instruction, 525; for parent-child rela- 
tionships, 487; for special education, 
525; for supervision, 305; see also 
rating 

Child accounting, see records; reports; 
school census; and cross references un- 
der school population 

Child study, see infants 

Chinese education, finance, 404; history, 
384 


Cinema, see photographic recording 

City school systems, bureaus of re- 
search, 581 

Classical education, 391 

Classification of cases, 467, 473 

Clinical approach, 483 


College, admission, 200; orientation, 217; 
see also particular topic; achievement; 
higher education; prediction 

Colleges and universities, accrediting, 512, 
536; bureaus of research, 584; faculty 
research, 585; see also higher educa- 
tion 

Colonial education, 395 

Comparative education, 361; bibliogra- 
phies, 401; see also foreign education 

Compulsory attendance, see attendance 

Conduct, see behavior 

Contemporary problems, 360 

Cooperative research, 571, 587 

Cooperative Study of Secondary School 
Standards, 572 

Correlation, technic, 547 

Cost of living, 535, 538; trends, 534 

Counseling, see guidance 

Court decisions, on attendance, 167 

Crippled children, 182 

Criticisms, of research, 368, 502 

Culture variations, problems, 370 

Cumulative records, see records 

Curriculum making, frequency studies, 
466 

Curves, for mental development, 92; for 
physical development, 49, 52, 79, 89, 
92; of learning, 282; of mental growth, 
39 

Czechoslovakian education, 377, 405 


Deaf and hard-of-hearing, 182 
Definitions, in research, 474, 556 
Delinquency, development of delinquent 
careers, 488; treatment programs, 485; 
see also behavior, antisocial 
Depression, effects on education, 402, 573 
Development, see growth 


Diagnosis, personality and adjustment, 
215 


Diary, see records 

Diet, and growth, 59; see also nutrition 
Differences, see variability 
Documentary analyses, quantitative, 466 
Drawing, development, 289 

Dutch education, colonial policies, 399 


Educational guidance, see guidance 

Educational philosophy, 342, 378 

Educational Policies Commission, 573 

Educational sociology, 379 

Electro-galvanometric studies, 492, 495, 
565 
Elementary education, history, 340 

Elementary school, see particular topic 

Emotion, 286; see also fear 

Endocrine glands, 71; see also mental de- 
velopment; physical development 

England, see Great Britain 

English, errors, 470 

Environment, 493; and delinquency, 29; 
and intelligence, 11, 21, 29, 41, 98, 
292; and social behavior, 370; socio- 
economic factors, 213; see also adjust- 
ment, family relations 

Errors, 470; of measurement, 548; see 
also English, errors 

Esthetic development, 34; see also art; 
music; poetry; rhythm 

Eugenics, see birth-rates 

European education, see foreign educa- 
tion 

Evaluation of educational outcomes, 521; 
of guidance, 185; see also appraisal; 
rating; tests and scales 

Evaluation Staff, 521 

Exceptional children, 181, 293; see also 
handicapped children; retarded chil- 
dren; special education; superior chil- 
dren; unstable children 

Experimentation, classroom, 555; laboratory,
564; technic, 551, 555, 564

Extensions of education, 359; see also 
adult education; higher education, ex- 
tension; preschool education 

Eye movement studies, 564 


Factor analysis, 19, 25, 36, 79, 290, 515, 
528; history, 528 

Factor clusters, 488 

Failure, see achievement; progress in 
school; promotion of pupils; pupil per- 
sonnel services; success in school 

Family relations, 289 

Fear, 287 

Federal government, research by, see re- 
search, in government; U. S. Office of 
Education 

Federal support of education, 534 

Field research, 580, 587 

Finance, see depression, support of edu- 
cation 

Follow-up studies, 409, 485 

Foreign education, for more specific 
references, see level, subject, or country 

Foreign language, 283 

Foreign-speaking persons, 354 
Forums, 353 

Foundations, philanthropic, 569 

French education, colonial policies, 397;
finance, 406

Frequency studies, see documentary 
analyses 


Genetic research, 491 

Genetic studies, see esthetic development;
intelligence 

German education, 372; finance, 406; history,
372; vocational, 409

Gifted children, 293 

Graduate education, rating, 537

Graphs, see pictures 

Great Britain, colonial policies, 395; history
of education, 361, 382; support of
education, 402, 405; vocational education,
409; see also African education;
Canadian education; India

Growth, curves of, 39, 49, 52, 79, 89; nature
of, 78; of infants, 5; of intelligence,
18; of schools, 163; physical,
47; technics of study, 80; see also
intelligence; physical development

Growth and development, curves, 493; 
technics of study, 491; see also ado- 
lescence; arithmetic, development of 
concepts; drawing, development; en- 
vironment; language development; 
mental development; physical develop- 
ment; records; social maturity 

Guidance and counseling, 185, 196, 214, 
410; appraisal, 196-220, 419; group, 
217; in college, 187; in other coun- 
tries, 410; needed research, 187; see 
also interview; vocational aptitude 


Handedness, see laterality 

Handicapped children, see behavior, anti- 
social; blind; crippled children; deaf 
and hard-of-hearing; exceptional chil- 
dren; mentally retarded children; 
remedial instruction; special education; 
speech defects 

Health, and physical development, 62; 
see also physical development 

Heart, 69 

Height, see physical development 

Heredity, 493 

High school, see particular topic 

Higher education, 347; bibliography, 347; 
buildings, 512; characteristics of stu- 
dents, 155; extension, 352; finance, 
351; history, 347; in other countries, 
403; legal aspects, 351; surveys, 511;
see also colleges and universities,
graduate education 

Historical method, 456; see also biblio- 
graphical work; documentary analyses 

History of education, 337, 457; for more 
specific references, see level, subject, or 
country 

Hollerith, see tabulating, machines 

Home, 492 

Homogeneity of data, 582 

Hypotheses, 551 


Illiteracy, 354; see also adult education 

Incentives, and learning, 30; and test 
scores, 26 

Index numbers, 532 

India, history of education, 381; support 
of education, 406 

Indians, education of, 370 

Individual differences, see variability be- 
tween individuals 

Industry, see research, in industry 

Infants, growth, 5; physical measure- 
ments, 52, 83; physical norms, 47; 
study of, 6 

Infants, growth, 492

Insight, see learning 

Instruction, see activity programs; audi- 
tory aids; learning; methods; supervi- 
sion; teaching success; visual aids 

Intelligence, and birth rank, 100; and 
environment, 11, 21, 41, 98, 291, 493, 
515; and health, 96; and heredity, 292; 
and motor abilities, 96; and physical 
characteristics, 94; and premature 
birth, 97; and race, 27; and schooling, 
42; and season of birth, 98; and trans- 
fer of training, 271; changes in, 515; 
constancy, 19, 22, 40, 291; develop- 
ment, 148; factors, 290; growth, 18; 
of high-school pupils, 42; of infants, 6; 
of training, 271; personal constant, 21; 
sex differences, 42; surveys, 152, 291; 
see also environment; mental defec- 
tives; mental development; mentally 
retarded children; racial differences; 
retarded children; superior children 

Intelligence tests, 271, 514; bibliogra- 
phies, 516; for infants, 6; individual, 
514 

Interests, see attitudes 

Interpretation, 466; of data, 215 

Interviews, 201, 484, 498; in research,
500; in teaching, 501; needed research,
204; reliability and validity, 498; see
also guidance and counseling


Japanese education, history, 388 
Junior college, statistics, 346 
Junior high school, history, 344 


Kindergarten, history, 338 


Laboratory studies, 564

Language, development of, 16, 33, 289; 
see also foreign language 

Laterality, 15, 567 

Latin American education, 368; history, 
369; needed research, 368 

Leadership, 153 

Learning, 297; curves, 282; insight, 263; 
laws, 257; needed research, 294; organ- 
ization, 262; practice, 276; psychology, 
255, 274, 285; studies of, 30; whole- 
part, 277; see also problem solving; 
study methods 

Legal research, methods, 460; see also
bibliographical work 

Legislation, attendance, 167 

Legislation and court decisions, 459, 461;
needed research, 463 

Library, 309 

Library procedures, see bibliographical 
work 


Marks, 172; see also records; reports on 
pupils 

Measurement, 514, 574; and incentives, 
26; bibliography, 520; philosophy of, 
520; physical, 80; see also norms; reliability;
scaling; tests and scales; validity

Mechanical aptitudes, 518 

Memory, 44, 279; studies of, 30 

Mental defectives, 101 

Mental development, 288; adolescence, 
38; and glands, 104; and nutrition, 
103; and physical development, 91, 103; 
and pubescence, 108; curves, 39, 92; 
factors affecting, 41; of infants, 5; 
see also intelligence 

Mental hygiene, 212 

Mentally retarded children, 183 

Methods of teaching, 295, 360; see also 
study methods 

Mexican education, see Latin American 
education 

Microphotography, 455 
Mission education, 384, 387, 389, 395, 398 

Motion pictures, frequency analysis, 467; 
see also photographic recording 

Motivation, 285 

Motor abilities, 8, 13, 35, 493; and in- 
telligence, 96; see also physical de- 
velopment 

Music, ability, 34 


National Education Association, Research 
Division, 578 

Nationality, see bilingual pupils 

Nature and nurture, 11, 26; see also en- 
vironment; intelligence 

Needed research, case study, 212; guid- 
ance, 187; handicapped children, 182; 
history of higher education, 348; in 
Latin America, 368; interviews, 204; 
learning, 294; legal aspects, 461; 
mental development, 45; personality 
and adjustment, 160; rating, 201; social 
background of education, 357 

Negro education, support of, 570 

Norms, 169, 512; factors affecting, 492; 
for physical development, 48; use of, 
495 

North Central Association of Colleges 
and Secondary Schools, 512 

Note taking, see bibliographical work 

Nursery schools, 23; history, 338 

Nutrition, 103; see also diet; health; 
physical development 


Objectives, broadened, 522 

Objectivity, of direct observation, 474 

Observation, as a research technic, 472; 
reliability, 478; see also recording of 
observations 

Occupations, see guidance 

Organization of schools, functional, 161; 
see also pupil personnel services 

Organized research, 569, 576 

Orientation, see college 

Out-of-school youth, see youth 


Parent-child relationships, 487 

Persistence in school, 172; see also cross 
references under school population 

Personality, 150, 154, 158; development 
of, 494; measurement, 518; scales, 198; 
see also adjustment; behavior; rating 

Personnel work, see pupil personnel serv- 
ices 

Philosophy, see educational philosophy 
Photographic recording, 80, 474, 566; of
infants, 9; of physical development,
82; roentgenographs, 80, 494; see also
eye movement studies; microphotography

Physical development, 47, 80, 147, 151,
155, 288; age, height, weight, 74; and
birth rank, 54; and glands, 105; and
health, 62; and mental development,
91; and premature birth, 54; and season,
56; curves, 49, 92, 493; norms, 48,
74; of infants, 5, 13; prediction, 493;
see also growth; motor abilities; nutrition

Physical education, history, 392; measure- 
ment, 513; surveys, 513 

Physically handicapped, see handicapped 
children 

Pictures, appreciation of, 34; see also 
photographic recording 

Poetry, 22, 35 

Prediction, of future education, 344; see 
also achievement 

Prenatal development, 492, 494 

Preschool education, history, 337 

Problem solving, 274; see also learning 

Professional aptitudes, 518 

Profiles, 536 

Progress in school, 168; see also promo- 
tion of pupils; success in school 

Promotion of pupils, 168; failure, 171 

Prophecies, see prediction 

Public relations, and research, 583 

Pupil personnel services, 147; organiza- 
tion, 166; see also adapting instruction 
to pupils; adjustment; attendance;
counseling; guidance; records; school 
census; school population; variability 
between individuals 

Puberty, see adolescence 


Questionnaires, 509; studies of, 502 


Racial differences, 27, 293; in mental 
ability, 10; physical, 56 

Rating, 520; needed research, 201; of 
pupils, 526; of teachers, 199, 526; 
scales, 199, 526; see also adjustment; 
appraisal; behavior; checklists; person- 
ality; score-cards 

Reading, and visual ability, 567; diag- 
nosis, 565; difficulty of material, 471; 
in college, 180 

Reasoning, studies of, 31, 492 
Recording of observations, 474; see also 
photographic recording, sound record- 
ing, stenographic recording 

Records, autobiographies, 206; biogra- 
phies, 289; pupil personnel, 174, 204, 
212; see also anecdotal records; marks; 
reports on pupils; stenographic records 

Reliability, calculation of, 549; concept 
of, 521, 552; of direct observation, 477; 
of interviews, 202, 498; of laboratory 
instruments, 565, 568; of physical mea- 
surement, 85; of questionnaire data, 502 

Remedial instruction, 180 

Reports, on pupils, 172 

Research, in government, 589; in indus- 
try, 589; in Latin America, 368; in 
other countries, 403; technics in phys- 
ical development, 80; technics of in- 
fant study, 6; trends in psychology, 
266; see also appraisal; assumptions;
bibliographical work; case study; co- 
operative research; documentary anal- 
yses; experimentation; field research; 
genetic research; historical method; in- 
terpretation of data; interviews; lab- 
oratory studies; legal research; meth- 
ods; microphotography; needed re- 
search; observation; organized re- 
search; questionnaires; rating; rec- 
ords; statistical methods; surveys; 
tabulating, machines; tests and scales; 
theses; particular subject field or topic 

Retardation, see mentally retarded chil- 
dren; progress in school 

Retarded children, 28 

Rhythm, 36 

Roentgenographs, see photographic re- 
cording 

Rural education, in Latin America, 369; 
supervision, 309 


Sampling, 468, 478, 541, 553, 559 

Scaling, 554; case histories, 487, 521 

School buildings, appraisal, 512, 527 

School census, 165; see also cross refer- 
ences under school population 

School population, characteristics, 147; 
number and change, 344, 359; see also
achievement; attendance; exceptional
children; growth of schools; persistence 
in school; pupil personnel services; 
progress in school; school census; sur- 
veys; variability between individuals 
Score-cards, 512, 527; see also rating 
Scoring, of direct observations, 476 


Secondary education, history, 342 

Sex differences, 42, 293; in intelligence, 
27, 42; see also physical development 

Size of classes, 179 

Social background of education, 342, 357, 
458; needed research, 357; see also 
contemporary problems 

Social conditions and changes, 357 

Social maturity, 29, 292 

Social patterns, 486 

Social pressure on schools, 458 

Social sciences, research in, 589 

Social surveys, 508; methods, 509 

Socio-economic factors, see environment 

Socio-economic status, see environment 

Sound recording, 311, 474, 499, 564 

South American education, see Latin 
American education 

Special education, 180; see also excep- 
tional children 

Speech defects, 183 

State education associations, research, 580 

State school systems, bureaus and depart- 
ments of research, 579, 587; evalua- 
tion, 524, 535; history, 340 

Statistical methods, 543; see also re- 
search; analysis of variance, correla- 
tion, factor analysis; index numbers; 
sampling; scaling; tests of significance; 
weighting 

Stenographic recording, 310, 474 

Students, see college; higher education; 
school population 

Study methods, 150, 219, 296, 299 

Success in school, factors, 156; see also 
achievement; progress in school; reme- 
dial instruction 

Superior children, 28 

Superstitions, 22, 503 

Supervised study, see methods, of teach- 
ing; study methods 

Supervision, 303; appraisal, 309; criteria, 
305 

Support of education, in other countries, 
401 

Surveys, 508; achievement, 156; bibliog- 
raphy, 511; city, 512; guidance, 185; 
higher education, 511; history, 511; 
methods, 511; vocational, 409; youth, 
192; see also adjustment; attitudes; 
school populations; social surveys 


Tabulating, 469; machines, 470, 554 
Taxation, in other countries, 403 
Teacher rating, see rating 
Teacher training, 304, 572; in-service, 
306 

Teaching load, 537 

Teaching success, criteria, 296, 304 

Teeth, 60, 64, 88 

Tests and scales, 199, 514, 520; construc- 
tion of, 522; for infants, 6; see also 
appraisal; measurement; norms; relia- 
bility; vocational aptitude 

Tests, effects on study, 266; of signifi- 
cance, 551, 559 

Textbook appraisal, 526 

Theses, 586 

Time studies, 476 

Training of research workers, 588 

Transfer of training, 268 

Trial-and-error, see learning 


Unit, 474 

United States Office of Education, 511, 
576, 587 

Unstable children, 29 


Validity, of interviews, 499; of questionnaire
data, 504; technics, 521; technics
for validating, 487, 554 

Variability between individuals, 147, 156,
292; see also exceptional children;
school population 

Visual aids, 284, 301 

Visual defects, and reading, 568 

Vocabulary, 26, 29, 33, 45; frequency 
studies, 468 

Vocational aptitude, 518; tests, 197 


Vocational education, in other countries,
408; research, 408; see also agricultural
education
Vocational guidance, see guidance 
Vocational interests, 155, 159, 519 


Weight, see physical development 
Weighting, 538, 541, 549 
Whole-part learning, see learning 
Writing, history of, 392 


Youth, 154; surveys, 192; see also ado- 
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